09-07-2021 09:34 AM
Hello,
I am working on this pipeline
The problem is with this part
I copy my result set.
I keep my first output
And aggregate my second output
And finally I want to bring those aggregated results back into the first output using a lookup.
This makes my pipeline run endlessly. If I remove the lookup (or join) and write to two different files instead, it takes less than 2 minutes.
I think it may be because the two outputs are copies of the same result set.
Could you please tell me whether you have seen this issue before and how to deal with it?
Thank you
09-08-2021 03:15 PM
Hi Wassim,
This is actually a known issue (SWAT-3096) that we’re working on a fix for. It happens when there are at least 1024 records being copied by the Copy snap, for reasons that are a bit difficult to explain.
Until we have a fix, there are at least three workarounds:
- Swap the order of the inputs to the Lookup snap, so that the output of the Aggregate is the first input rather than the second.
- Insert a Sort snap right after each output of the Copy snap. It won’t work if you put the Sort before the Copy. In this workaround, the point of the Sort snaps isn’t to sort the data, which might already be sorted – it’s to essentially create independent buffers of the data from each of the Copy snap’s output views.
- Replace the Lookup with a Join, and set the Sorted streams property to Unsorted.
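One way to picture why the second workaround helps is the toy model below. It is only a sketch under assumptions that the thread does not confirm: the Copy snap writes each document to both outputs in turn, each branch has a bounded buffer (the name `BUFFER_CAPACITY` and its value are hypothetical, picked only to echo the threshold mentioned above), the Aggregate emits nothing until its input ends, and the Lookup does not read its first input until its second input ends. The function `pipeline_stalls` and the flag `independent_buffers` are illustrative names, not part of any SnapLogic API.

```python
BUFFER_CAPACITY = 1024  # hypothetical value, not taken from SnapLogic documentation


def pipeline_stalls(num_records: int,
                    buffer_capacity: int = BUFFER_CAPACITY,
                    independent_buffers: bool = False) -> bool:
    """Toy model: return True if Copy would block forever in this simplified picture."""
    # Documents parked on the branch that feeds the Lookup's first input.
    # In this model the Lookup is not draining them, because it is waiting
    # for the Aggregate, and the Aggregate is waiting for Copy to finish.
    waiting_for_lookup = 0
    for _ in range(num_records):
        if not independent_buffers and waiting_for_lookup >= buffer_capacity:
            # Copy cannot write the next document to this branch, so it also
            # stops feeding the Aggregate branch: nobody can make progress.
            return True
        # With independent buffering (the role the Sort snaps play in the
        # workaround), the branch simply keeps absorbing documents.
        waiting_for_lookup += 1
    return False


print(pipeline_stalls(6000))
print(pipeline_stalls(6000, independent_buffers=True))
```

In this model a small input finishes, a 6000-row input stalls, and giving each branch its own unbounded buffer removes the stall. The real cause may differ in its details.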

09-08-2021 08:44 AM
Below are some points to keep in mind about the In-Memory Lookup Snap:
- The join operation within the Snap starts only when the right input document stream ends. In your case, this means the Snap first waits for the aggregation to complete, and only then does the In-Memory Lookup process the data.
- All of the right input data is loaded into memory (of the JVM) as lookup data, so the Snap can cause poor performance.
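The two points above can be sketched in a few lines of Python. This is a minimal illustration of the general in-memory lookup pattern, not the Snap's actual implementation; the function name `in_memory_lookup` and the field names in the usage example are hypothetical.

```python
from typing import Any, Dict, Iterable, Iterator

Document = Dict[str, Any]


def in_memory_lookup(left: Iterable[Document],
                     right: Iterable[Document],
                     key: str) -> Iterator[Document]:
    """Enrich each left document with the matching right document, if any."""
    # Step 1: drain the whole right input into memory. Nothing is emitted
    # until the right stream has ended.
    table: Dict[Any, Document] = {}
    for doc in right:
        table[doc[key]] = doc

    # Step 2: only now stream the left input and merge in the matches.
    for doc in left:
        match = table.get(doc[key], {})
        yield {**doc, **match}


rows = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}]
totals = [{"id": 1, "total": 10}]
for merged in in_memory_lookup(rows, totals, "id"):
    print(merged)
```

Because the left input is not read until the right input is complete, anything upstream that needs the left side to be consumed has to wait as well.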
Do you have other processes running in parallel with this one that also use similar Snaps (Join, aggregation Snaps, Group Snaps, etc.) and affect memory usage?
Did you try the same scenario using the Join Snap?
Regards,
Spiro Taleski
09-08-2021 10:57 AM
@Spiro_Taleski
Thank you for the answer.
I did try Join as well.
I am aware of all that. I have only 6000 rows to aggregate and join back to my first result, and it is not even moving.
It looks like this:
If I duplicate my snaps and feed the aggregate and the join from the initial snaps separately, it takes less than a minute.
Regards
09-08-2021 11:20 AM
Here is a simple example:
test copy and join_2021_09_08.slp (14.2 KB)
test - 2021-09-03T152531.018.xlsx (740.4 KB)

09-09-2021 07:06 AM
Thank you very much, @ptaylor.
