How do you keep only one record of duplicated records?

mtran21
New Contributor III

My pipeline is a simple one: it executes an Oracle query and then writes the data to a CSV file. The query result contains some duplicate records (same ID) because the database itself has those duplicates. However, I want to keep only one record from each set of duplicate rows. How do I do that?

6 REPLIES

vineesha
Employee

Hi, the Unique Snap eliminates duplicate documents in a document stream, such as duplicate rows.
Use the Unique Snap before the Mapper and let me know if you still have any issues.

mtran21
New Contributor III

My goal is to keep one record from each duplicate pair.

So if I use the Unique Snap, will it remove the entire duplicate pair?

vineesha
Employee

It removes duplicates and retains one record from each set of duplicate rows.

mtran21
New Contributor III

I see that, but the Unique Snap only removes a duplicate when both rows are exactly the same.
In my case, only the ID column is duplicated. The other columns are not the same.
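
To make the difference concrete, here is a rough sketch in plain Python (not SnapLogic). It compares whole-row de-duplication with keeping the first row per ID. The function names, the `NAME` field, and the sample rows are all made up for illustration, and "keep the first row seen" is just one possible rule for choosing the survivor.

```python
def unique_rows(rows):
    """Drop a row only when every field matches an earlier row."""
    seen = set()
    kept = []
    for row in rows:
        # The whole row is the fingerprint, so any differing column keeps it.
        fingerprint = tuple(sorted(row.items()))
        if fingerprint not in seen:
            seen.add(fingerprint)
            kept.append(row)
    return kept


def first_per_key(rows, key="ID"):
    """Keep only the first row seen for each value of `key`."""
    seen = set()
    kept = []
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            kept.append(row)
    return kept


# Hypothetical sample data: ID 1 repeats with different NAME values,
# ID 2 repeats as an exact copy.
rows = [
    {"ID": 1, "NAME": "a"},
    {"ID": 1, "NAME": "b"},
    {"ID": 2, "NAME": "c"},
    {"ID": 2, "NAME": "c"},
]

whole_row = unique_rows(rows)   # both ID 1 rows survive, ID 2 collapses to one
by_id = first_per_key(rows)     # one row per ID
```

With this sample, whole-row matching still leaves two rows for ID 1, while matching on ID alone leaves exactly one row per ID. That second behavior is what I am after.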