Hi, Anyone aware of how do we parse the excel with multisheet without even knowing sheet index / sheet name with having certain number of records in each sheet. I am having requirement to remove the email ids from first row of first sheet and rest of the sheets as is upload. Please let me know is there any other approach instead of parsing each sheet Any help is appreciated.

@rashmi - this is a fairly complicated issue for a couple reasons. First, SnapLogic does not have a snap to read multiple sheets into a single stream. Second, the Excel Multi Sheet Formatter requires pre-determined, named views as input to write multiple sheets. So if you have input files with different number and names of sheets, there is no out of the box solution for you here. This does not mean there is no hope; it’s just a more technical discussion. Excel files are basically zip archives, which you can read using the ZipFile Read snap and use a Binary Router snap to check the $['content-location'] value for the first ‘xl/worksheets/*’ reference. From here, it will take some work on your part on how you can remove that string from the content and push it back with the rest of the Excel file content into the ZipFile Write to reconstitute the file. This work is more than I can spend time to provide a working example for you here, but is an interesting problem. I hope you share your findings!

As is typical for me, I just couldn’t leave this one alone. Here is a solution I came up with based on this discussion. It simply removes the value from A1 of the first sheet. Community 13302 - Excel Remove A1 from sheet1_2022_09_01.slp (9.0 KB) Please replace the files with your own. Note that when you download, the output file may switch to .zip, so you just need to rename it so you can open it as an Excel file. If you intend to use this, please test it thoroughly as I would consider this to be a Proof of Concept solution.

Thank you so much @darshthakkar,@pmancevski

Hi @rashmi, This maybe it’s not the best solution but you could try with recursion, and you will control it from “Excel Parser” snap from error view, you will wait/loop until you receive particular error - Sheet with the provided index does not exist: Here are some links which can guide you when developing recursive pipelines: https://community.snaplogic.com/t/recursive-pipeline-solution/6202 https://community.snaplogic.com/t/pipeline-execute-maximum-recursion-calls-depth/12245 https://community.snaplogic.com/t/recursive-logic-in-snaplogic/12921 Thanks, Pero M.

Thank you @pmancevski for sharing those links. I believe that would be a lot to check on @rashmi’s end. Instead, lemme loop in the experts who can help her on this one. CC: @dmiller @koryknick @ptaylor @bojanvelevski @siwadon

Dynamically parse the excel without having sheet index or sheet name

11 Replies

koryknick
Employee
3 years ago
@rashmi - this is a fairly complicated issue for a couple reasons. First, SnapLogic does not have a snap to read multiple sheets into a single stream. Second, the Excel Multi Sheet Formatter requires pre-determined, named views as input to write multiple sheets. So if you have input files with different number and names of sheets, there is no out of the box solution for you here.

This does not mean there is no hope; it’s just a more technical discussion. Excel files are basically zip archives, which you can read using the ZipFile Read snap and use a Binary Router snap to check the
$['content-location'] value for the first ‘xl/worksheets/*’ reference. From here, it will take some work on your part on how you can remove that string from the content and push it back with the rest of the Excel file content into the ZipFile Write to reconstitute the file.

This work is more than I can spend time to provide a working example for you here, but is an interesting problem. I hope you share your findings!
koryknick
Employee
3 years ago
As is typical for me, I just couldn’t leave this one alone. Here is a solution I came up with based on this discussion. It simply removes the value from A1 of the first sheet.
Community 13302 - Excel Remove A1 from sheet1_2022_09_01.slp (9.0 KB)
Please replace the files with your own. Note that when you download, the output file may switch to .zip, so you just need to rename it so you can open it as an Excel file.

If you intend to use this, please test it thoroughly as I would consider this to be a Proof of Concept solution.
- rashmi
  New Contributor III
  3 years ago
  Thanks a lot @koryknick!! this approach is good but I was thinking is there any way to remove that row itself than keeping first row as empty. Please could you help i mean to say instead of replace fuction in mapper can we use some function to remove that row itself?
pmancevski
New Contributor III
3 years ago
Hi @rashmi,

This maybe it’s not the best solution but you could try with recursion, and you will control it from “Excel Parser” snap from error view, you will wait/loop until you receive particular error - Sheet with the provided index does not exist:

Here are some links which can guide you when developing recursive pipelines:
https://community.snaplogic.com/t/recursive-pipeline-solution/6202
https://community.snaplogic.com/t/pipeline-execute-maximum-recursion-calls-depth/12245
https://community.snaplogic.com/t/recursive-logic-in-snaplogic/12921

Thanks,
Pero M.
darshthakkar
Valued Contributor
3 years ago
Thank you @pmancevski for sharing those links. I believe that would be a lot to check on @rashmi’s end.
Instead, lemme loop in the experts who can help her on this one.

CC: @dmiller @koryknick @ptaylor @bojanvelevski @siwadon
- rashmi
  New Contributor III
  3 years ago
  Thank you so much @darshthakkar,@pmancevski
koryknick
Employee
3 years ago
@rashmi - I’ve looked into this a bit and actually removing the row is quite a bit more technical due to internal references that Excel uses within the Sheets. This level of technical complexity is outside the scope of what I can spend on a Community response. If someone else solves this, I would love to see the solution!
- rashmi
  New Contributor III
  3 years ago
  No problem @koryknick Thanks a lot for helping
koryknick
Employee
3 years ago
@rashmi - Here is an updated pipeline that will remove the first row. The pipeline itself is a little less elegant now, but the concept is pretty simple. The top path is basically the original ZipFile Read to get the elements of the Excel file, then discards the sheet1 data. The second path that starts with the File Reader is using the Excel Parser to read the sheet1 contents, discards the first row, then creates a temporary Excel file with only the sheet1 data - then we read that temporary Excel file with the ZipFile Read snap and discard everything except the sheet1 data, then merge that with the top path to create the target Excel file.

Community 13302 - Excel Remove A1 from sheet1_2022_09_07.slp (24.5 KB)

This seems to work with my testing files, but note that the Excel Formatter does not support cell formatting (bold, italics, background color, cell spanning, etc.) so you would lose that in sheet1 with this method.

A couple things to note in this example:

Temporary file uses the pipe.tmpDir built-in property - I love this feature since it references a temporary directory that is automatically deleted when the pipeline ends

The “unused” or “garbage” paths of the Binary Routers are using a Script snap (Consume input) to consume the unwanted input documents so we don’t have any open output ports in the pipeline. This is technically only required when a pipeline is used as a Triggered Task or as a child pipeline called via Pipeline Execute from another pipeline, but it is a recommended best practice to close unused open output ports in your pipeline.
lingared
New Contributor
3 years ago
@koryknick This example works great if it is to replace an existing sheet. My requirement is to add a new sheet along with existing ones.

Suppose I already have an excel file with three sheets and I am getting a stream of data which I need to put into the fourth one… is there a solution for this?
koryknick
Employee
3 years ago
If you know the names of the sheets, you can read in the Excel file using the Excel Parser snap (once for each sheet), then re-create the file using the Excel Multi-Sheet Formatter snap with 4 input views.

It will look something like this:

Forum Discussion

Dynamically parse the excel without having sheet index or sheet name

11 Replies

Recent Discussions

Javascript to promote top level lists

Google Sheets Subscribe questions

Basic string transformations not working

Can we generate XML file in pretty print format using native snapLogic snaps?

Multipart Reader failure - 'content-type' was not found