File Reader + parser - assign filename

I’ve got a pipeline which is reading multiple files.

A simplified version is shown below:
[image: simplified pipeline]

I need the output to contain a column with the filename of the original file.
As the file content has nothing to do with the filename, I can’t use the normal “copy / join” pattern: the join would have to be one without a condition, which would multiply my data by the number of input files.

The CSV Parser does not seem to let me pass the filename through, which would really be the simplest and best solution.
Has anyone built something similar, or does anyone have ideas on how to accomplish this?

Hi @matthias.voppichler ,

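You can use the following example:
[image: example pipeline]

In the Binary Router you can set the following:
[image: Binary Router settings]

In the Mapper you can use this:
[image: Mapper settings]

And in the Join you can set this:
[image: Join settings]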

Try it and let me know the outcome.

Regards

Unfortunately, this solution only works as long as there is a single filename.

As soon as there are multiple file documents (which is the case for me), I get multiple documents in the “mapper1” step, so the join with “true = true” multiplies my rows by the number of files.

Assuming I have 3 files with 2 lines each (6 distinct lines expected in the output), I would receive 6 x 3 = 18 lines, because each content line is combined with every filename.
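
To make the arithmetic concrete, here is a rough sketch in plain Python (not SnapLogic) of what an unconditional join does. The file names and line values are made up for illustration; the point is only that once the parsed lines no longer carry their filename, a "true = true" join pairs every line with every filename.

```python
from itertools import product

# Hypothetical input: 3 files with 2 lines each.
files = {
    "a.csv": ["a1", "a2"],
    "b.csv": ["b1", "b2"],
    "c.csv": ["c1", "c2"],
}

# One branch carries only the filenames (one document per file).
filenames = list(files)

# The other branch carries the parsed lines; the filename is lost here.
rows = [line for lines in files.values() for line in lines]

# Join without a condition ("true = true"): every row meets every filename.
cross_joined = [
    {"filename": name, "line": row} for name, row in product(filenames, rows)
]

print(len(rows))          # 6 lines expected in the output
print(len(cross_joined))  # 18 lines actually produced
```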

Hey @matthias.voppichler,

You can use Pipeline Execute. Put the CSV Parser in a child pipeline and pass the filename to it as a parameter. In the child pipeline, a Mapper can then read that parameter and add it to every line that comes out of the CSV Parser. Files are read one at a time, parsed, and the filename is added to each file's lines.

You can also combine Marjan’s solution with Pipeline Execute and you’ll get the same desired result.
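
The per-file idea can be sketched in plain Python (again, not SnapLogic itself). Here `parse_with_filename` is a hypothetical stand-in for the child pipeline, its `filename` argument plays the role of the pipeline parameter, and the file contents are invented for the example.

```python
import csv
import io


def parse_with_filename(filename, content):
    """Stand-in for the child pipeline: parse one CSV and tag each row."""
    reader = csv.DictReader(io.StringIO(content))
    # Equivalent of the Mapper step: add the parameter to every parsed line.
    return [{**row, "filename": filename} for row in reader]


# Hypothetical input: 3 files with 2 lines each.
files = {
    "a.csv": "id,value\n1,x\n2,y\n",
    "b.csv": "id,value\n3,z\n4,w\n",
    "c.csv": "id,value\n5,u\n6,v\n",
}

# The parent loop calls the "child" once per file, so rows never mix.
rows = []
for name, content in files.items():
    rows.extend(parse_with_filename(name, content))

print(len(rows))  # 6 lines, each tagged with its own filename
```

Because each call only ever sees one file, no join is needed and the row count stays at 6.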

Regards,
Bojan

As @bojanvelevski said, you can use this solution together with the Pipeline Execute Snap to process one document at a time.

Regards,
Marjan

This will work, yes. I was actually hoping to find a better solution without a child pipeline, but it looks like that’s the way to go.
