Output preview not available for Inner Join

Hi Team,

I’m using an Inner Join and observing that the output preview doesn’t appear. Is this an expected behaviour?
I however, have access to the data I’m expecting out of Inner join (as I’m writing to an excel file) but using a mapper after Inner Join doesn’t help as there is no data in the input schema of the mapper. How do I filter the data that I don’t need out of the Inner Join when the preview is not available and Input schema is showing nothing?

I also have access to the other filtered data as I have enabled the error view in Inner Join and that does have an output preview to it.

Appreciate your help and time on this.

Best Regards,
Darsh

Seems like using mapper after Inner Join is practically impossible. Is there a workaround?

@darshthakkar - I think you may be running into a common misunderstanding with the Data Preview functionality. By default, SnapLogic pulls only the first 50 records for preview into each snap. If the two sources you are reading from do not have any records that align to meet your join criteria with an inner join, then nothing will be output.

For example, if SourceA contains records such as CustomerId from 1 to 50, and SourceB contains records with CustomerId from 2000 - 2050, then the inner join doesn’t match anything and all records are effectively filtered.

You can try updating your User Settings / Preview Document Count as high as 2000 to see if that helps:
image

Or you can set filter conditions on your source data to ensure alignment of data that will match in your Join - in the case of the CustomId values I gave, you could set a filter condition in your database select to pull only records with CustomerId values of 1-50.

To prove if my theory is correct, you can switch your Join to a Left Outer join type and see that documents will flow through to the output view of the Join.

1 Like

Thank you @koryknick for responding to this.
I will try with 2 sample flat files and test the functionality of Inner Join. While I was reading your explanation, I felt it was right, Left Outer Join always gives me an output however that is not the case with Inner Join.

What am I going to do next would be:

  1. Prepare 2 flat files with IDs similar
  2. Use a sort before Join
  3. Use Inner Join
  4. Check the Output preview
  5. If (4) fails (highly unlikeable), go to settings and change the preview count to max

Would you suggest any other steps than the ones stated above? I can keep you posted once I test this.

Regards,
Darsh

If you don’t mind, can I ask you something?
Why would I not be able to see anything under input schema for the mapper attached to the Inner Join? Is the data not matching for the first 50 records the reason?

That is correct - if the Mapper has no input, it can’t determine the schema.

1 Like

I haven’t gotten a chance to test this out but it’s on my to-do list so should be completed before the end of this week and I can keep you posted @koryknick on how it goes.

Thank you for your help on this one.

Regards,
Darsh

@koryknick: Thank you for your patience.
I was able to test with 2 dummy files and could literally see the output preview on the Inner Join. Those dummy files had a max of 27 rows (which is very less in comparison to the data I deal with everyday)

However, with the kind of data I’m dealing with my use case, its in hundreds of thousands of records for which a large chunk of data wouldn’t suffice the condition of Inner Join → In such cases even if I select the “Preview Document Count” as 2000, the output of Inner Join wouldn’t be there.

What would you suggest me here? What else can I use instead of an Inner Join?

Output previews are very helpful while developing the pipeline as we can literally see the ouput of what we are developing, the workaround for this would be exporting the results in a flat file and analyzing the flat file thereafter, isn’t this time consuming and prone to errors (as there would be human intervention)?

If you were thinking of any other ways, I would request you to please share.

@koryknick: What if the inputs that go into an Inner Join have been sorted, would the ouput preview be randomized or would it be based on the sorted input?

I sorted both the inputs and I was expecting some values in preview but I didn’t. If output preview is purely randomized then we can disregard this scenario. Some snaps below for your reference:

Pipeline showing no output preview on Inner Join:

Snowflake Sort snap exported results for references:

Flat file Sort snap exported results for references:

Preview count is 50, if it would have been 1, I could understand nothing coming on an output preview:
image

Screenshot of Join settings:

Without the output preview, its challenging to go ahead with other transformations using mapper, sort, etc. What would you suggest using instead of this?

CC: @ptaylor

Look at the preview data for the Join’s two inputs. Can you find a pair of documents from each view that should be joined?

If not, try increasing the preview count to 2000.

Thank you @ptaylor for your quick response.
This opens up a new discussion as Sort snap’s output preview is not aligned with the exported results. I have tried validating it with Shift + Validate but haven’t found any luck. Some screenshots might help you understand this in detail:

Output preview of snowflake execute:

As the Name field is string in snowflake, used the following in the mapper:

Sorting it in ascending order so that the order matches with the flat file:
image

Output preview of Snowflake Sort:

This is amusing as if you check the exported flat file, the results will come in ascending order (snaps shared in my previous comment)

Long story short: No however I was expecting it to appear that’s the reason Sort snap was used. If results are random, I can remove the sort snap as it is an overhead to the pipeline and would impact the performance.

The results are not random.

Can you please switch the preview type of the Sort output to display JSON instead of Table? I want to see if the NAME value is a string or a number.

Sure, please find the below screenshots for your reference:

Ok, thanks. The data you’re showing ARE sorted. The Sort is just like the Join or any other snap. It’s only going to sort the data that it receives on its input. If you’re doing a validate instead of an execute and your preview doc count is 50, the Sort is only going to sort 50 records. That appears to be exactly what it’s doing.

1 Like

If the numbers in the one input view start at 1000 and the numbers in the other input view are all just three digits, I would expect no output from the Join.

1 Like

What happens if you change the Preview Document Count to 2000?

Got it but I was expecting Sort snap to change these results. I mean, I get your point based on the inputs a snap receives but it will take me some time to sink this fact.

No change as I’m dealing with a lot of data and not all would suffice the join condition (as mentioned before)