
Mapping source to target columns

Akshita
New Contributor

Hi All,

We have two input files, source and target, with a different number of columns in each file.
We need to map the source columns to the target columns, and the sequence of columns in the source file is not the same as in the target file.

Any inputs on how to do this?


ForbinCSD
Contributor

Hi, @Akshita! Welcome to the SnapLogic pipeline developer community.

To answer your question, we’ll need some clarification from you. As it stands now, there are several apparent contradictions, so it’s not clear what you want to do.

First off, are the source and target really both inputs? Or is the source the input, and the target is the output? The mapper snap only takes one input view (doc stream) and produces one output view (doc stream). It changes the inputs to the outputs, thus “mapping” them.

Second off, what format are the files? CSV? Something else?

Thirdly, are there column headings or names associated with the columns? In the source file? In the target file?

In the meantime, I’ll try to give you a few things to think about:

Generally, when you read data in SnapLogic, the JSON document stream where it ends up will be attribute/value pairs and not order-dependent. So, first you’ll read the data, get it into a document stream, then transform the document stream into a set of names and values appropriate for the output. Only when you go to write the output will you actually put them in some kind of order.
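To make the attribute/value idea concrete, here’s a minimal Python sketch (plain `csv` module, outside SnapLogic, with made-up sample data) of how a CSV row becomes an order-independent set of name/value pairs, roughly what one document in the doc stream looks like:

```python
import csv
import io

# Hypothetical source data; in SnapLogic the CSV Parser does this step for you.
source_csv = "name,age,city\nAlice,30,Paris\nBob,25,Lyon\n"

# Each row becomes a dict of column-name/value pairs -- the rough
# equivalent of one JSON document in the pipeline's document stream.
rows = list(csv.DictReader(io.StringIO(source_csv)))
print(rows[0])  # {'name': 'Alice', 'age': '30', 'city': 'Paris'}
```

Once the data is in this shape, you address fields by name, not by position, which is why column order stops mattering until you write the output.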

When the intermediate form of the data really does require ordering, then you’ll use arrays and/or ordered lists to manage that. But that is more for handling “row 2” vs “row 20” and not for columns within the rows. Think of it just like you do columns in a table in an RDBMS – those are generally order-independent as well (even though the physical storage underneath does have an ordering).

Looking forward to your reply with the additional information!

– JB, aka “Forbin”

Hi @ForbinCSD,

Thank you for the response.

Source and target are two different input files, and both are in CSV format. I am attaching a sample scenario for further understanding.
In the screenshot attached, input1 is the source file, which contains column headers and data. Input2 is the target file, which contains only column headers.
Output is the format we require.
Basically, input2 defines the output format; we need the data from input1 placed into the input2 layout for the respective matched columns.
(Screenshots attached: Input1, Input2, Output columns.)

Try reading the source data and the output file first, then use a Join snap (probably a merge), filter out the null columns (those which are not present in both), and then you can map the required fields.

Thanks

ForbinCSD
Contributor

It sounds like you’re not changing the data at all, just outputting the fields requested by Input2 and in the order they appear there.

If that’s the case, then the Mapper snap will do you no good. It’s primarily for data transformation (although it can be used to drop or inject properties on docs in the docstream).

I work primarily with SQL RDBMSes and/or externally-originating JSON streams, so I haven’t played around with the CSV snaps much in SnapLogic. I would investigate the entire suite of those, if I were going to do a lot with CSVs.

If I were personally building this right now in a hurry, though, I think the most straightforward way would be mostly done inside a Script snap.

Do you know either JavaScript, Ruby, or Python?
Those are the 3 scripting languages available in SnapLogic.

Your pipeline might look something like this:
(screenshot of the suggested pipeline)

You pull in your data from file input1 and it comes in as a series of JSON docs, 1 doc per row, and each doc has property names matching the column headings. You have “has headings” turned ON in the top CSV Parser to do this.

From input2 you get just the list of column names. (Turn OFF the “has headings” in the CSV parser. You don’t have data with headings; the headings are the data in this case!)

If this doesn’t get you the columns from input2 in the order they occur, you may have to forgo the 2nd CSV Parser and use a “Binary to Document” snap instead. You might have to play around with this to get the list of columns into the docstream in a way that you can know for certain the correct ordering that was wanted.

Once you start writing the script, it should be pretty easy from there. Grab the list of output fields (and their ordering) from wherever in the doc stream you’ve put them. Then, for each document containing a data row from the input1 file, copy only the requested properties to the output, in the order requested.
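The per-row copy described above might look something like this in Python (a sketch only; the real Script snap wraps input/output views in its own API, so the variable names here are placeholders fed with sample data):

```python
# Placeholder data: in the pipeline, target_columns would come from
# input2's header row and data_rows from input1's parsed documents.
target_columns = ["city", "name"]          # order requested by input2
data_rows = [
    {"name": "Alice", "age": "30", "city": "Paris"},
    {"name": "Bob", "age": "25", "city": "Lyon"},
]

def reorder(row, columns):
    """Copy only the requested properties, in the requested order."""
    return {c: row.get(c, "") for c in columns}

output_rows = [reorder(r, target_columns) for r in data_rows]
print(output_rows[0])  # {'city': 'Paris', 'name': 'Alice'}
```

Since Python 3.7, dicts preserve insertion order, so writing these rows back out (e.g. with `csv.DictWriter`) keeps the input2 column order; columns missing from the source simply come out empty.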

Hope that helps!