philliperamos
6 years ago · Contributor
Error when joining a large data stream to a small one
Good day all,
I’m getting an error when joining a large data stream (118M rows) with a smaller one (276 rows).
The join matches a field called ‘column_name’ to ‘col’, and it’s an inner join.
The smaller data set contains data validation info (length, data type, etc.) that I’m adding to each row of data, hence the join on the column name.
It works well with a smaller sample, but when I reach into the millions, the join snap fails.
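To make the intent concrete, here is a rough Python sketch of what the join is meant to do logically. This is not SnapLogic code and not how the Join snap works internally. It assumes ‘column_name’ lives in the large stream and ‘col’ in the small one; the function name and the sample field names are hypothetical.

```python
from typing import Any, Dict, Iterable, Iterator

Row = Dict[str, Any]


def inner_join_lookup(rows: Iterable[Row], rules: Iterable[Row]) -> Iterator[Row]:
    """Inner-join a large stream of rows to a small set of validation rules.

    Assumption: each row has a 'column_name' field and each rule has a
    'col' field, and the two are matched on equality.
    """
    # The small side (276 rows in my case) fits in memory, so index it once.
    rules_by_col = {rule["col"]: rule for rule in rules}

    # The large side is processed one row at a time and never fully loaded.
    for row in rows:
        rule = rules_by_col.get(row["column_name"])
        if rule is None:
            continue  # inner join: rows with no matching rule are dropped
        merged = dict(row)
        # Copy the validation info onto the row, leaving out the join key.
        merged.update({k: v for k, v in rule.items() if k != "col"})
        yield merged
```

With this shape, memory use depends only on the small data set, which is the behaviour I was hoping to get from the join.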
See the error below:
Any assistance would be appreciated.