06-24-2022 06:50 PM
Hi Team,
I’m using an Inner Join and observing that the output preview doesn’t appear. Is this an expected behaviour?
However, I do have access to the data I’m expecting from the Inner Join (I’m writing it to an Excel file), but adding a Mapper after the Inner Join doesn’t help because the Mapper’s input schema is empty. How do I filter out the data I don’t need from the Inner Join when the preview is not available and the input schema shows nothing?
I can also see the rows that were filtered out, because I enabled the error view on the Inner Join, and that view does have an output preview.
Appreciate your help and time on this.
Best Regards,
Darsh
06-27-2022 05:42 AM
That is correct - if the Mapper has no input, it can’t determine the schema.
07-07-2022 06:37 AM
I haven’t had a chance to test this yet, but it’s on my to-do list and should be done before the end of this week. I’ll keep you posted on how it goes, @koryknick.
Thank you for your help on this one.
Regards,
Darsh
07-22-2022 12:42 PM
@koryknick: Thank you for your patience.
I was able to test with two dummy files and could see the output preview on the Inner Join. Those dummy files had at most 27 rows, which is very few compared with the data I deal with every day.
However, my use case involves hundreds of thousands of records, and a large share of them don’t satisfy the Inner Join condition. In such cases, even if I set “Preview Document Count” to 2000, the Inner Join shows no output preview.
What would you suggest here? What else can I use instead of an Inner Join?
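One possible explanation for the difference between the small and large inputs can be sketched in plain Python (this is not SnapLogic code). The sketch assumes that the preview passes only the first "Preview Document Count" documents of each input to the join, which is not confirmed anywhere in this thread. The function names, field names, and row counts are all hypothetical.

```python
from itertools import islice


def inner_join(left, right, key):
    """Hash-based inner join of two iterables of dicts on `key`."""
    index = {}
    for rrow in right:
        index.setdefault(rrow[key], []).append(rrow)
    return [{**lrow, **rrow} for lrow in left for rrow in index.get(lrow[key], [])]


def preview_inner_join(left, right, key, preview_count):
    """Join only the first `preview_count` rows of each input.

    ASSUMPTION: this mimics a preview that truncates each input before the join.
    """
    return inner_join(islice(left, preview_count), islice(right, preview_count), key)


# Hypothetical data: 300,000 rows per side, but only ids 250,000..299,999 overlap.
left = [{"id": i, "src": "snowflake"} for i in range(300_000)]
right = [{"id": i, "file": "flat"} for i in range(250_000, 550_000)]

full = inner_join(left, right, "id")
preview = preview_inner_join(left, right, "id", 2000)
print(len(full), len(preview))
```

Under that assumption, the full join finds matches while the truncated join finds none, because the first 2000 rows of each side share no keys. With two 27-row dummy files, both inputs fit inside the preview limit, so the preview sees every match.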
07-22-2022 12:47 PM
Output previews are very helpful while developing a pipeline because we can see the output of what we are building. The workaround would be to export the results to a flat file and analyze that file afterwards, but isn’t that time-consuming and error-prone (as it involves human intervention)?
If you have any other approaches in mind, please share them.
07-22-2022 01:41 PM
@koryknick: If the inputs to an Inner Join are sorted, would the output preview be randomized, or would it be based on the sorted input?
I sorted both inputs and expected to see some values in the preview, but there were none. If the output preview is purely randomized, we can disregard this scenario. Some screenshots below for your reference:
Pipeline showing no output preview on Inner Join:
Snowflake Sort snap exported results for reference:
Flat file Sort snap exported results for reference:
Preview count is 50; if it had been 1, I could understand nothing appearing in the output preview:
Screenshot of Join settings:
Without the output preview, it’s challenging to go ahead with other transformations using Mapper, Sort, etc. What would you suggest using instead?
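On the sorted-input question, one way to check whether a preview of 50 could show anything is to find where the first common key sits in each sorted export. The Python sketch below is a minimal illustration, assuming (unconfirmed) that the preview feeds only the first 50 documents of each sorted input into the join. The helper name and the sample keys are hypothetical stand-ins for the two exported Sort outputs.

```python
def first_common_key_positions(left_keys, right_keys):
    """Return (i, j) of the first key shared by two ascending-sorted lists, or None."""
    i = j = 0
    while i < len(left_keys) and j < len(right_keys):
        if left_keys[i] == right_keys[j]:
            return i, j
        if left_keys[i] < right_keys[j]:
            i += 1  # left key is too small to match; advance left
        else:
            j += 1  # right key is too small to match; advance right
    return None


# Hypothetical sorted join keys, standing in for the two exported files.
left_keys = list(range(0, 1000))        # 0, 1, 2, ...
right_keys = list(range(400, 1400, 2))  # 400, 402, 404, ...

pos = first_common_key_positions(left_keys, right_keys)
preview_count = 50
# ASSUMPTION: a match is visible only if it lies within the first
# `preview_count` rows of BOTH inputs.
visible = pos is not None and max(pos) < preview_count
print(pos, visible)
```

If the first shared key appears beyond row 50 on either side, sorting alone would not be enough to produce a preview under this assumption, even though the full run still produces matches.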
CC: @ptaylor