02-28-2019 02:27 PM
I have a Multi File Reader reading a series of S3 files using a wildcard, and writing the data to Snowflake. There is a Mapper in between. Functionally, everything is working as expected.
I’d like to get the name of the file from which the data was read, and write it to a Meta_FileName column. How do I retrieve the file name from the Multi File Reader? I’m assuming it’s an expression to be added in the Mapper, but I’m not sure. TIA!
01-18-2022 06:56 AM
Hmm, the same code works on my side, so I am wondering whether the file data could be having some impact.
Could you try re-adding the Join Snap (first delete the existing Join, then drag a new one from the Snap palette) and apply the same configuration?
Inner Join with “1” for Left and Right Join Paths.
Regards,
Spiro Taleski
01-18-2022 07:14 AM
I deleted the Join and dragged a new one onto the canvas. The setup is below, but still no luck.
01-18-2022 07:33 AM
Strange.
As a workaround, you can achieve the same result by using the Directory Browser Snap and a parent-child pipeline configuration.
Parent Pipeline
Child Pipeline
This is actually what was proposed above by tstack.
It’s currently not possible to pass the binary header that contains the file name through the CSVParser. Instead, you can use a DirectoryBrowser snap to get the file names and then kick off child pipelines to read the files and do the SnowflakeUpserts. In that case, you’ll be passing the filename as a pipeline parameter to the child pipeline, so you can use a Mapper to add the parameter into the documents that are going into the Upsert. A side-benefit of this is that you can process multiple files in parallel.
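To make the data flow concrete, here is a rough Python analogy of the parent-child pattern described above. It is not SnapLogic code: the function names `parent_pipeline` and `child_pipeline`, the in-memory `files` dict standing in for the directory listing, and the sample CSV contents are all hypothetical. The point is only that each file is handled by its own "child" run, which receives the file name as a parameter and stamps it onto every record.

```python
from __future__ import annotations

import csv
import io
from concurrent.futures import ThreadPoolExecutor


def child_pipeline(filename: str, content: str) -> list[dict]:
    """Parse one CSV file and add the file name to every row.

    `filename` plays the role of the pipeline parameter; adding it to
    each row plays the role of the Mapper in the child pipeline.
    """
    rows = csv.DictReader(io.StringIO(content))
    return [{**row, "Meta_FileName": filename} for row in rows]


def parent_pipeline(files: dict[str, str]) -> list[dict]:
    """Run one child per file (in parallel) and collect all rows.

    `files` is a hypothetical stand-in for the directory listing:
    it maps a file name to that file's text content.
    """
    names = sorted(files)
    with ThreadPoolExecutor() as pool:
        # map() yields results in the same order as `names`
        results = pool.map(lambda n: child_pipeline(n, files[n]), names)
        return [row for rows in results for row in rows]


# Hypothetical sample data: two small CSV files
files = {
    "a.csv": "id,val\n1,x\n2,y\n",
    "b.csv": "id,val\n3,z\n",
}
records = parent_pipeline(files)
```

In SnapLogic itself, pipeline parameters are referenced in expressions with a leading underscore, so assuming the parameter is named `filename`, the Mapper in the child pipeline would map the expression `_filename` to `$Meta_FileName`.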
Regards,
Spiro Taleski
01-24-2022 03:21 PM
Thank you for responding. Do you think this will work if I have subfolders in my source? I tried the recommended method and I am getting an error. I believe the issue is that I am passing the child pipeline multiple file names at once, because I have multiple files in a folder.
01-25-2022 12:02 AM
The Directory Browser Snap should work with subfolders. Please check the documentation: https://docs-snaplogic.atlassian.net/wiki/spaces/SD/pages/1438716/Directory+Browser
Regarding the error: from what I can see, the child pipeline has more than one unlinked output view. Please check the child pipeline and make sure it has only one unlinked output view, or none.
Regards,
Spiro Taleski