Forum Discussion

NAl
Contributor
5 years ago
Solved

Group By Field behaving mysteriously

I have a Union snap with 4 inputs where the incoming documents contain the same columns. To deduplicate the rows, I am preparing the data by using the Group by Field snap to identify those with a shared column value.

When I validate the pipeline, the grouping does not look right: the same value shows up in more than one group. For example, if I have 15 incoming documents and 2 groups are duplicated, I end up with 17 documents in the output.

2 Replies

  • bojanvelevski
    Valued Contributor

    Hi @NAl,

    Add a Sort snap before Group By Field and sort on the same field you are grouping by. It is good practice to sort before using Group By, Unique, Join, etc.

  • winosky
    New Contributor III

    There is also the Deduplicate snap, if that helps.
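
To illustrate why sorting first matters, here is a rough Python analogy. It is not how the Group By Field snap is implemented; it only assumes the snap starts a new group whenever the field value changes between consecutive documents, which would explain the duplicated groups in the question. The `group_consecutive` helper and the sample documents are made up for illustration.

```python
from itertools import groupby
from operator import itemgetter


def group_consecutive(docs, field):
    """Group only ADJACENT documents that share the same field value.

    Assumption: this mimics a streaming group-by that closes a group
    as soon as the field value changes.
    """
    return [(key, list(items)) for key, items in groupby(docs, key=itemgetter(field))]


# Hypothetical documents arriving from several Union inputs, unsorted.
docs = [
    {"id": "A", "src": 1},
    {"id": "B", "src": 1},
    {"id": "A", "src": 2},
    {"id": "C", "src": 3},
    {"id": "B", "src": 4},
]

# Without sorting, "A" and "B" each end up in two separate groups.
unsorted_groups = group_consecutive(docs, "id")

# Sorting on the grouping field first makes equal values adjacent.
sorted_groups = group_consecutive(sorted(docs, key=itemgetter("id")), "id")

print(len(unsorted_groups))  # 5
print(len(sorted_groups))    # 3
```

With the unsorted input, the values "A" and "B" each appear in two groups. After sorting on the same field, each value appears in exactly one group.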