09-23-2019 06:10 AM
Good day all,
I’m getting an error when joining a large data stream (118M rows) with a smaller one (276 rows).
The join matches the field ‘column_name’ to ‘col’, and it’s an inner join.
The smaller data set contains data validation info (length, data type, etc.) that I’m adding to each row of data, hence the join on the column name.
It works well with a smaller sample, but when I reach into the millions, the join snap fails.
See the error below:
Any assistance would be appreciated.
09-23-2019 11:21 AM
As you are not using sorted data, you might find that the Join snap is trying to sort the large input data itself. In this scenario, you might find that using the In-Memory Lookup rather than the Join gives you a better result, because it doesn’t have to sort the input streams.
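To illustrate the idea outside of SnapLogic, here is a minimal Python sketch of a lookup-style inner join. It is not how either snap is implemented; the function name `lookup_join` and its parameters are hypothetical, and only the field names ‘column_name’ and ‘col’ come from the original post. The small input is loaded into a dictionary once, and the large input is streamed past it one row at a time, so nothing needs to be sorted.

```python
from typing import Dict, Iterable, Iterator


def lookup_join(
    rows: Iterable[Dict[str, object]],
    rules: Iterable[Dict[str, object]],
    row_key: str = "column_name",
    rule_key: str = "col",
) -> Iterator[Dict[str, object]]:
    """Inner-join a large stream against a small in-memory table (hypothetical helper)."""
    # Build the lookup table once from the small input.
    # If two rules share the same key, the later one wins.
    table = {rule[rule_key]: rule for rule in rules}

    # Stream the large input; only one row is held at a time.
    for row in rows:
        rule = table.get(row[row_key])
        if rule is not None:  # inner join: rows without a match are dropped
            yield {**row, **rule}
```

Memory use here depends on the size of the small input, not on the 118M-row stream, which is why a lookup can behave differently from a join that has to sort its inputs first.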
09-23-2019 11:32 AM
Thanks @cstewart, I’ll give it a try after hours and see how it performs.
10-09-2019 06:43 AM
So I tried using the In-Memory Lookup, and I still got the same error.
05-25-2023 11:17 PM
Has anyone found a fix for this?