SnapLogic - Integration Nation

rashmi · ‎09-01-2022

Hi,
Anyone aware of how do we parse the excel with multisheet without even knowing sheet index / sheet name with having certain number of records in each sheet.
I am having requirement to remove the email ids from first row of first sheet and rest of the sheets as is upload. Please let me know is there any other approach instead of parsing each sheet

Any help is appreciated.

koryknick · ‎09-01-2022

As is typical for me, I just couldn’t leave this one alone. Here is a solution I came up with based on this discussion. It simply removes the value from A1 of the first sheet.
Community 13302 - Excel Remove A1 from sheet1_2022_09_01.slp (9.0 KB)
Please replace the files with your own. Note that when you download, the output file may switch to .zip, so you just need to rename it so you can open it as an Excel file.

If you intend to use this, please test it thoroughly as I would consider this to be a Proof of Concept solution.

rashmi · ‎09-02-2022

Thanks a lot @koryknick!! this approach is good but I was thinking is there any way to remove that row itself than keeping first row as empty. Please could you help i mean to say instead of replace fuction in mapper can we use some function to remove that row itself?

koryknick · ‎09-06-2022

@rashmi - I’ve looked into this a bit and actually removing the row is quite a bit more technical due to internal references that Excel uses within the Sheets. This level of technical complexity is outside the scope of what I can spend on a Community response. If someone else solves this, I would love to see the solution!

rashmi · ‎09-06-2022

No problem @koryknick Thanks a lot for helping

koryknick · ‎09-07-2022

@rashmi - Here is an updated pipeline that will remove the first row. The pipeline itself is a little less elegant now, but the concept is pretty simple. The top path is basically the original ZipFile Read to get the elements of the Excel file, then discards the sheet1 data. The second path that starts with the File Reader is using the Excel Parser to read the sheet1 contents, discards the first row, then creates a temporary Excel file with only the sheet1 data - then we read that temporary Excel file with the ZipFile Read snap and discard everything except the sheet1 data, then merge that with the top path to create the target Excel file.

Community 13302 - Excel Remove A1 from sheet1_2022_09_07.slp (24.5 KB)

This seems to work with my testing files, but note that the Excel Formatter does not support cell formatting (bold, italics, background color, cell spanning, etc.) so you would lose that in sheet1 with this method.

A couple things to note in this example:

Temporary file uses the pipe.tmpDir built-in property - I love this feature since it references a temporary directory that is automatically deleted when the pipeline ends
The “unused” or “garbage” paths of the Binary Routers are using a Script snap (Consume input) to consume the unwanted input documents so we don’t have any open output ports in the pipeline. This is technically only required when a pipeline is used as a Triggered Task or as a child pipeline called via Pipeline Execute from another pipeline, but it is a recommended best practice to close unused open output ports in your pipeline.

SnapLogic - Integration Nation

Dynamically parse the excel without having sheet index or sheet name