File Reader + parser - assign filename

Matthias
New Contributor III

I’ve got a pipeline which is reading multiple files.

A simplified version is shown below:
image

I’d need the output to contain a column with the filename of the original file.
As the file content has nothing to do with the filename, I can’t use the normal “copy / join” pattern: the join would have to be one without a condition, which would multiply my data by the number of input files.

The CSV Parser does not seem to let me pass the filename through, which would really be the simplest and best solution.
Has anyone built something similar - or has some ideas on how to accomplish this?

1 ACCEPTED SOLUTION

bojanvelevski
Valued Contributor

Hey @matthias.voppichler,

You can use Pipeline Execute. Put the CSV Parser in a child pipeline and pass the filename to it as a pipeline parameter. Inside the child, a Mapper can read that parameter and map it onto every line that comes out of the CSV Parser. Files are read one at a time, parsed, and the filename is added to the lines accordingly.
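
The idea can be illustrated outside SnapLogic with a minimal Python sketch. This is only an analogy, not SnapLogic code: the function `parse_with_filename` is a hypothetical stand-in for the child pipeline, its `filename` argument plays the role of the pipeline parameter, and the in-memory `files` dictionary is made-up sample data.

```python
import csv
import io


def parse_with_filename(filename, content):
    """Stand-in for the child pipeline: parse one file and tag every row."""
    reader = csv.DictReader(io.StringIO(content))
    # The dict merge mimics the Mapper adding the parameter to each row.
    return [{**row, "filename": filename} for row in reader]


# Hypothetical sample input: two small CSV files held in memory.
files = {
    "a.csv": "id,value\n1,x\n2,y\n",
    "b.csv": "id,value\n3,z\n",
}

# Stand-in for Pipeline Execute: one call per file, one file at a time.
rows = []
for name, content in files.items():
    rows.extend(parse_with_filename(name, content))

for row in rows:
    print(row)
```

Because each file is parsed in its own call, the filename is known at the moment the rows are produced, so no join is needed afterwards.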

You can even combine Marjan’s solution with Pipeline Execute, and you’ll get the same result.

Regards,
Bojan


5 REPLIES

marjan_karafilo
Contributor

Hi @matthias.voppichler,

You can use the following example:
image

In the binary router you can set the following:
image

In the mapper you can use this:
image

And in the join you can set this:
image

Try it and let me know the outcome.

Regards

Matthias
New Contributor III

Unfortunately, this solution only works as long as there is a single filename.

As soon as there are multiple file documents (which is the case for me), I get multiple documents in the “mapper1” step, so the join on “true = true” multiplies my rows by the number of files.

Assuming I have 3 files with 2 lines each (6 distinct lines expected in the output), I would receive 6 x 3 = 18 lines, as each content line is combined with every filename.
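
The multiplication can be reproduced with a small Python sketch. It is only an analogy for the unconditional join, not SnapLogic code, and the file names and line values are made up for illustration.

```python
from itertools import product

# Hypothetical sample data: 3 files with 2 lines each.
files = {
    "a.csv": ["a1", "a2"],
    "b.csv": ["b1", "b2"],
    "c.csv": ["c1", "c2"],
}

filenames = list(files)
all_rows = [row for rows in files.values() for row in rows]

# A join on "true = true" pairs every content line with every filename.
cross_joined = list(product(filenames, all_rows))

print(len(all_rows))      # 6 lines expected in the output
print(len(cross_joined))  # 18 lines actually produced
```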

marjan_karafilo
Contributor

As @bojanvelevski said, you can combine this solution with the Pipeline Execute snap to process one file at a time.

Regards,
Marjan
