I have the following use-case and I haven’t found any way to get it done within SnapLogic. I am wondering if it is possible at all using this tool.
I have a CSV File in Snaplogic Designer with the following fields (example) $ID, $Name, $AmountPaid1, $AmountPaid2.
I also have an Excel File with the following fields: $ID, $AmountPaid2.
I want to alter the original CSV file by “looking up” the excel file and replacing the value of $AmountPaid2 (in the CSV) with the value of $AmountPaid2 from the excel file, only where $ID from the excel = $ID from the CSV.
Yes, developing this logic in SL is very much possible. I am attaching a sample pipeline which does exactly what you need:
Lookup_2021_10_14.slp (8.9 KB)
Thanks very much for your help @bojanvelevski . I have used the example in the sample pipeline that you sent and strangely enough, my output document is unchanged. The only difference is that I am joining on two fields instead of 1. I am stumped as to what is preventing this from working.
It would help to tell us why you’re joining on another field and which field, though it looks like $AmountPaid2 is the only other field you could join on. But including that field in the join does not make sense based on your description – you want to replace the value of $AmountPaid2 in the CSV because the $AmountPaid2 value in the Excel file is different, right? Joining only on $ID should be exactly what you need. The pipeline that @bojanvelevski provided does exactly what you asked, does it not?
Thanks for your response. I have simplified my use case to make it easier to communicate here. I am left outer joining on two fields instead of just one because for my use case, the unique identifier is made up of two fields since the output document has duplicate ids.
@bojanvelevski’s solution works in his sample but when I use the same logic with the exception of left outer joining on two fields (2 fields = the unique identifier), my output data is unchanged.
The unique identifier is called ID and COV. The excel that I am joining has matching ID and COV with the output document but it doesn’t seem to work.
I see, thanks. Can you share more details? Otherwise there’s nothing to go on to help you.
Sure, I am not sure what details would help, I will try to share as much as I can, please see below.
Output document is derived from a series of CSV files. Left outer joins are done on each file and the resulting data’s fields are manipulated at the mapper with the red streak above it. The CSV files do not have identical fields. Most of them only have a couple fields in common (the unique identifiers) and I have not padded the missing fields of each file with null columns. The ID and COV fields that I am joining onto the main output document exists inside the document stream and not actually in a file, these ID and COV fields are created inside the mapper with the red streak above it. The circled join snap and the mapper after it is where I am implementing @bojanvelevski 's logic.
Below is the configuration of the circled join snap. I need to use bojan’s logic twice which is why there are two joins and two excels.
Below is the configuration of the mapper immediately after the circled join snap. It is not currently using the logic provided by bojan, I have changed it to what you see in the below pic after observing that bojan’s logic didn’t work for me but what I have now isn’t working either. The target path contains the names of the fields that i need to replace.
Please let me know if I can share any more to help.
Yes, that is a lot more complicated than your original example. You’ve got a 3-way join, and you’re not joining on two fields, you’re joining on four! Two fields from the CFRI input and two from the CFCVG input. So are all three input streams sorted on both of the keys (TAIX-POL and TAIX-COV)?
BTW, you’ve got two snaps labelled Join2. Might help to change one so the labels are all distinct.
Also, are you able to validate this pipeline so you can preview the input and output data at each step? That can really help diagnose issues like this.
Thanks, yes you are correct; I am joining 4 fields. I have all 3 input streams sorted ascending on the keys (TAIX-POL and TAIX-COV). I will change the name of the Join2 snap as well.
I can validate the pipeline but the majority of the preview data is null, i think this is because of the multiple left outer joins that I used to derive the output document. So its hard for me to troubleshoot. Below is a screenshot of the preview output data of the “Map Replacement Values” Mapper snap (where I assign the values of the fields in the excels to the fields in the output document)
Edited (forgot to upload screenshot):
Ok, that’s the output that’s clearly not what you want. But if your join isn’t working the way you expect, what you need to study are its inputs. And maybe share them here so we can help diagnose the issue.
Thanks, thats good advice. I have looked at the very first join that I am doing and I have observed that it is returning all nulls in the preview data. I have switched it to an inner join and I have found that no preview data is being returned after validating. I am certain that there are matching IDs in the documents that I am trying to join.
When validating a Join snap set to use “Inner Join” does Snaplogic only attempt to join data of the top 50 records of each document? Or should I see a preview of 50 successfully joined records from the two documents?
Only the first 50 documents of each input view. But try changing Preview Document Count from 50 to 2000 in User Settings.
One thing you could do is instead of trying to join on those 4 different fields, you could use a mapper to create a single ‘key’ field (just concatenate all the values that make up the unique key). Do this for both sources and then do the join on that single key field. You can add a sort in their for good measure if you want, though I think joins support unsorted data. If you need to, write the output at this point just before the join into a file and make sure the keys are matching exactly between the two data sets. All it takes is one tiny difference between what you are trying to join on between your data sets for it to be off, including if a field is an integer object or a string.
Thanks everyone for your help. @bojanvelevski 's solution works. The issue I was facing was that the join was failing due to the ID’s in both files not matching.
The output document’s ID’s, derived from the CSV files were encapsulated with quotes (") where as the excel didn’t have these. Data types were both text so for example, the join was looking to match ID: 001 with ID: “001”, this would fail. Once I sorted this issue out by doing a replaceAll() function in the join snap, the solution worked flawlessly.
Thank you @ptaylor for pointing me in the right direction to check my data. In this case I triple checked it. I can’t believe that I missed that lol.
I would have expected the CSV Parser to remove those double quotes. But I’m glad you figured it out.
BTW, where/how did you put the replaceAll in the Join snap settings?