I'm working on a pipeline that needs to remove duplicate IDs by date, i.e. I only need the most recent row for each ID.
I achieved this with the following approach:
This worked fine until a few days ago, when the pipeline started failing at the "Duplicates" snap. While investigating, I found that the "Confidence" option was not selected and that the snap had an additional option, "Group Id", that I did not remember seeing before:
I had to select the "Confidence" option to get it working again.
Did you make any recent changes to this snap?
Do you have any better ideas to satisfy these requirements?
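To make the requirement concrete, the logic I need is roughly equivalent to the Python sketch below. The field names `id`, `date` and `value`, the helper `latest_per_id`, and the sample rows are placeholders for illustration only, not my real data, and this says nothing about how the "Duplicates" snap works internally.

```python
from datetime import date


def latest_per_id(rows):
    """Keep only the most recent row for each id (hypothetical field names)."""
    latest = {}
    for row in rows:
        current = latest.get(row["id"])
        # Keep the row if this id is new or the row is newer than the stored one.
        if current is None or row["date"] > current["date"]:
            latest[row["id"]] = row
    return list(latest.values())


# Placeholder sample data: id 1 appears twice with different dates.
rows = [
    {"id": 1, "date": date(2023, 1, 5), "value": "a"},
    {"id": 2, "date": date(2023, 1, 7), "value": "b"},
    {"id": 1, "date": date(2023, 2, 1), "value": "c"},
]

result = latest_per_id(rows)
for row in result:
    print(row["id"], row["date"], row["value"])
```

If two rows for the same ID share the same date, this sketch keeps the first one it encounters; either would be acceptable in my case.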