I’m currently using a Diff snap as part of a pipeline that inserts data into a SQL server for reporting purposes. I’m using a diff snap so that I can determine if a record needs to be updated or inserted, the problem I’m having is that 2 of the records I would expect to be coming through as modified records are showing up as New instead, which is causing a primary key error when I hit the insert snap. I thought that the record I set as the sort path as acting as a key for determining if the record was new or not, but now I’m not sure that’s the case. Would anyone be able to explain to me what could be going on and how I might be able to resolve it? Ideally I’d like the diff snap to only be looking at the ID column to decide if a record is new or not.
Do you have the same target table, as an original source in the diff snap? Have you tried validating and executing?
Here’s a picture of my pipeline, I’m using a CSV file from a 3rd party system as the new file and then I’m using a select statement from my database as the original document. I run them both through a mapper to ensure that the column headings both match and the sort is on the unique ID. I’ve also run the pipeline normally and in validation mode. For some reason 2 records keep coming through as insertions even though the unique ID exists in both files. All the other files flow correctly to either deleted or unmodified (there were other modified records from before, but they all properly flowed to the update output when I ran the pipeline normally earlier which can be seen here
So something about these 2 records is making the diff snap think they are new when they really are not.
Isolate just those 2 IDs, from the incoming data and from the original (Select). Validate and download the jsons from both streams, and compare those with Notepad++ or VSCode whatever you’re using.