Finding alternative for Join/Union Snap?

aditya_gupta41
Contributor

On my project, we are NOT ALLOWED to use the Join/Union Snaps in our pipelines because, according to the team's theory, they reduce the speed of the pipeline.

Can someone help me find an alternative?

Thanks in Advance

4 REPLIES

bojanvelevski
Valued Contributor

Hey @aditya.gupta41 ,

You can use the Gate Snap and remap the data afterwards, but I seriously doubt that this will be faster than Union. You can also use the In-Memory Lookup Snap, depending on what type of join you need.

psung
Employee

It is incorrect to say Union Snap “reduces the speed of the pipeline”. Union Snap is one of the simplest and fastest Snaps.
Join Snap is also very fast if the input data streams are sorted. If they are not, Join Snap’s internal sorter sorts and stages all input data in local temp files until the input data streams reach the end, which is why users may see it slow a pipeline down. However, it is one of the most widely used Snaps and offers good price-performance. Of course, if all your input data is already in a database, it would be faster to join it in the database.
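
To illustrate why sorted input matters, here is a minimal Python sketch of a streaming merge join. It is not SnapLogic's implementation; the function name `merge_join` and the dict-per-record shape are assumptions made for the example. Because both inputs are already sorted on the join key, only one group of equal keys from the right side is held in memory at a time, and nothing has to be staged on disk first.

```python
from itertools import groupby
from operator import itemgetter


def merge_join(left, right, key):
    """Inner-join two iterables of dicts that are ALREADY sorted by `key`.

    Hypothetical helper for illustration only; not a SnapLogic API.
    """
    get_key = itemgetter(key)
    left_groups = groupby(left, key=get_key)
    right_groups = groupby(right, key=get_key)
    done = (None, None)  # sentinel returned when a stream is exhausted

    lk, lrows = next(left_groups, done)
    rk, rrows = next(right_groups, done)
    while lrows is not None and rrows is not None:
        if lk < rk:
            # Left key has no partner yet: advance the left stream.
            lk, lrows = next(left_groups, done)
        elif lk > rk:
            # Right key has no partner yet: advance the right stream.
            rk, rrows = next(right_groups, done)
        else:
            # Keys match: buffer only this one group of right rows.
            right_buffer = list(rrows)
            for l in lrows:
                for r in right_buffer:
                    yield {**l, **r}
            lk, lrows = next(left_groups, done)
            rk, rrows = next(right_groups, done)
```

If the inputs were not sorted, every record would have to be collected and sorted before the first output row could be produced, which matches the staging behaviour described above.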
If you need to inner-join just two input data streams and one of them has fewer than approximately 100,000 Map objects, In-Memory Lookup Snap would be faster than Join Snap, since it stages the right view input data in memory.
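
The in-memory lookup idea can be sketched in Python as a hash join. This is only an illustration of the technique, not the Snap's actual code; `lookup_join` and the record layout are hypothetical. The small right-hand input is loaded into a dictionary once, and the larger left-hand input is then streamed through it without any sorting.

```python
from collections import defaultdict


def lookup_join(left_stream, right_rows, key):
    """Inner-join by holding the (small) right side in memory.

    Hypothetical helper for illustration only; not a SnapLogic API.
    """
    # Build the in-memory index from the small input, once.
    index = defaultdict(list)
    for row in right_rows:
        index[row[key]].append(row)

    # Stream the large input; each record is a constant-time lookup.
    for l in left_stream:
        for r in index.get(l[key], ()):
            yield {**l, **r}
```

The trade-off is memory: the whole right side must fit in RAM, which is why this approach only suits a small lookup input.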
If you post in more detail why your team does not allow the use of Join/Union Snaps and why they think these Snaps are slow, it would be easier to help you find alternatives.

mmussitsch
New Contributor II

Hi, I know this is an old post, but I wanted to ask:
Should JOIN “merge” “unsorted” behave similarly to UNION?
I found a recent issue where the output data from each is exactly the same, but the JOIN caused 4000+ rows of data to hang and not return, while UNION resolved it in an instant. Just curious about the difference with “merge” and why this may have happened.

TIA - Melissa

Both are different.

Sample Input Data: 

virender_prajap_0-1690351998850.png

virender_prajap_1-1690352046447.png

Join - Merge: Appends data by adding columns (records are merged side by side, i.e. horizontally). If a column name matches, the column is renamed. The number of O/P records will be the maximum of the input views' record counts.

O/P of Join Merge

virender_prajap_2-1690352246234.png

Union: Merges two streams by adding rows (records are stacked, i.e. merged vertically). If a column name matches, the data is populated in the same column. The number of O/P records will be the sum of the record counts of all input views.

O/P of Union

virender_prajap_3-1690352903309.png
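
The difference in record counts can be sketched in Python. This is only an approximation of the behaviour described above, not the Snaps' actual logic: the function names are hypothetical, and the `_1` suffix used for a clashing column name is an assumption, since the real renaming scheme is not shown in this thread.

```python
from itertools import chain, zip_longest


def union(*views):
    """Stack all input views: output count is the SUM of the inputs."""
    return list(chain.from_iterable(views))


def join_merge(left, right):
    """Pair records by position: output count is the MAX of the inputs."""
    out = []
    for l, r in zip_longest(left, right, fillvalue={}):
        row = dict(l)
        for name, value in r.items():
            # Assumed rename rule for a clashing column name.
            target = name if name not in row else f"{name}_1"
            row[target] = value
        out.append(row)
    return out


view_a = [{"id": 1, "name": "A"}, {"id": 2, "name": "B"}, {"id": 3, "name": "C"}]
view_b = [{"id": 10, "city": "X"}, {"id": 20, "city": "Y"}]

print(len(union(view_a, view_b)))       # 3 + 2 records
print(len(join_merge(view_a, view_b)))  # max(3, 2) records
```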