Error when joining a large data stream to a small one

philliperamos
Contributor

Good day all,

I’m getting an error when joining a large data stream (118M rows) with a smaller one (276 rows).
The join matches the field ‘column_name’ to ‘col’, and it’s an inner join.
[image]
The smaller data set contains data validation info (length, data type, etc.) that I’m adding to each row of data, hence the join on the column name.
It works well with a smaller sample, but once the row count reaches the millions, the Join snap fails.

See the error below:
[image: error message]

Any assistance would be appreciated.

4 REPLIES

cstewart
Former Employee

As you are not using sorted data, the Join snap may be trying to sort the large input data. In that case, the In-Memory Lookup will likely give you a better result than the Join, because it doesn’t have to sort the input streams.
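
To illustrate the general idea (this is not SnapLogic's implementation, just a minimal Python sketch of a lookup-style join): the small side is loaded into a dictionary once, and the large side is streamed through it row by row, so nothing needs to be sorted. The field names ‘column_name’ and ‘col’ come from the original post; the function name `lookup_join`, the sample rows, and the assumption that ‘col’ is unique in the small data set are hypothetical.

```python
from typing import Dict, Iterable, Iterator


def lookup_join(
    large_rows: Iterable[dict],
    small_rows: Iterable[dict],
    large_key: str = "column_name",
    small_key: str = "col",
) -> Iterator[dict]:
    """Inner-join a large stream to a small table via an in-memory dict."""
    # Build the lookup once from the small side (276 rows in the post).
    # Assumption: small_key is unique; a duplicate key would overwrite
    # the earlier row.
    lookup: Dict[str, dict] = {row[small_key]: row for row in small_rows}

    # Stream the large side one row at a time; nothing is sorted or buffered.
    for row in large_rows:
        match = lookup.get(row[large_key])
        if match is not None:  # inner join: rows without a match are dropped
            yield {**row, **match}


# Hypothetical sample data standing in for the two input streams.
validation = [
    {"col": "name", "max_length": 50, "data_type": "string"},
    {"col": "age", "max_length": 3, "data_type": "integer"},
]
data = (
    {"column_name": c, "value": v}
    for c, v in [("name", "Ann"), ("age", "41"), ("unknown", "x")]
)
joined = list(lookup_join(data, validation))
```

Memory use here depends only on the size of the small side, which is why this approach tends to suit a 276-row lookup table joined against a very large stream.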

Thanks @cstewart, I’ll give it a try after hours and see how it performs.

philliperamos
Contributor

So I tried using the In-Memory Lookup, and I still got the same error.

dd_snaplogic
New Contributor II

So has anyone found a fix for this?