Forum Discussion

jfpelletier
Contributor
2 years ago
Solved

Splitting zip files

Hello all, As part of my pipeline I'm downloading a .zip file from a server (using the HTTP Client snap) and the zip file that I'm getting contains documents in different languages that are stored u...
  • koryknick
    2 years ago

    jfpelletier - Attached is a much more generic solution that uses the execution node's tmp space as a cache: the files are unzipped there temporarily so they can be re-zipped into one file per language.  There are two pipelines: a parent that splits the input zip file into individual files in the tmp mount, and a child that is called to re-zip the files for each language contained in the input zip.

    The parent pipeline reads the zip file, writes each file to tmp, waits for the entire zip to be split out (using the Tail snap), gets the list of language directories that were created (using the Directory Browser snap), and calls the child (using the Pipeline Execute snap) to write the individual zip file for each language.
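
    For readers who think better in code, here is a rough Python analogue of the parent pipeline's logic. It is not the SnapLogic pipeline itself, only a sketch of the same steps. The function name split_zip_by_language and its parameters are made up for illustration, and it assumes each language is a top-level directory inside the zip.

```python
import zipfile
from pathlib import Path


def split_zip_by_language(zip_path, scratch_dir):
    """Extract every entry of zip_path into scratch_dir and return the
    sorted names of the top-level directories that were created.

    Hypothetical helper: assumes one top-level directory per language.
    """
    scratch = Path(scratch_dir)
    with zipfile.ZipFile(zip_path) as archive:
        # extractall() returns only after every entry has been written,
        # which plays the role of the Tail snap (wait for the full split).
        archive.extractall(scratch)
    # Listing the directories mirrors the Directory Browser step.
    return sorted(p.name for p in scratch.iterdir() if p.is_dir())
```

    Each name returned here corresponds to one call of the child pipeline.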

    Note that the tmp files are written using the pipe.tmpDir built-in value.  This directory is scratch space for the executing pipeline and exists only while the pipeline is running: all contents are automatically purged as soon as the pipeline ends, whether it succeeds or fails.
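
    If the lifecycle of that scratch directory is unfamiliar, Python's tempfile.TemporaryDirectory behaves in a similar way and may help as a mental model. This is only an analogy, not how pipe.tmpDir is implemented, and run_with_scratch is a hypothetical name.

```python
import tempfile
from pathlib import Path


def run_with_scratch(work):
    """Call work(scratch_path) with a scratch directory that is removed
    when the call finishes, whether it returns normally or raises."""
    with tempfile.TemporaryDirectory() as scratch:
        # The directory and everything in it are deleted on leaving this
        # block, on success and on failure alike.
        return work(Path(scratch))
```

    Anything that has to outlive the run therefore needs to be written somewhere other than the scratch directory.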

    The child pipeline gets the list of files in the language directory, reads each file's content, uses the Mapper snap to change the "content-location" value to recreate the file path, then creates a single zip file containing all of the files associated with that language.
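
    The child pipeline's job can be sketched the same way in plain Python. Again, this is an illustration under assumptions rather than the attached pipeline: zip_language_dir, its parameters, and the <language>.zip output name are all hypothetical, and the entry names here keep the language directory as a prefix, which may differ from what the Mapper expression produces.

```python
import zipfile
from pathlib import Path


def zip_language_dir(scratch_dir, language, out_dir):
    """Write <out_dir>/<language>.zip containing every file found under
    <scratch_dir>/<language>, and return the path of the new archive.

    Hypothetical helper: out_dir must already exist and should not be
    inside the language directory being zipped.
    """
    scratch = Path(scratch_dir)
    out_path = Path(out_dir) / f"{language}.zip"
    with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for file_path in sorted((scratch / language).rglob("*")):
            if file_path.is_file():
                # Rebuild the entry name relative to the scratch root, the
                # same job the Mapper does by rewriting "content-location".
                arcname = file_path.relative_to(scratch).as_posix()
                archive.write(file_path, arcname)
    return out_path
```

    Calling this once for every language directory gives one zip file per language.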

    Note that the Mapper snap in this pipeline was switched to use Binary input and output views.  In this context, the Mapper is only going to affect the binary header values, not the actual file content.

    Note that the syntax used in these pipelines to refer to the "tmp" directory location may not work as-is on a Snaplex running under Windows.  It was developed and tested on a Snaplex running Linux.

    Hope this helps!