Forum Discussion

walkerline117
Contributor
7 years ago
Solved

Does the CSV Formatter snap aggregate the input documents?

Hi.

If we have 5 input documents for a CSV Formatter snap, how many output documents will I get from the CSV Formatter?

My observation so far is that the CSV Formatter snap groups all input documents together and outputs only one big CSV file.

Is there a way to change this behavior? In my example, if I have 5 input documents, how do I configure the CSV Formatter snap to output 5 documents/files?

Thanks

  • tstack
    7 years ago

    Since the CSV Formatter currently doesn’t work with arrays, you will need to use a child pipeline that splits the array into separate docs that go into a CSV Formatter.

    I’m attaching a couple of pipelines to demonstrate this flow:

    FormatArrayToCSV_2019_02_28.slp (5.6 KB)
    WriteCSVFiles_2019_02_28.slp (5.4 KB)

    The FormatArrayToCSV pipeline is the child. It splits the “$array” property in the incoming document and then passes the results to a CSV Formatter and FileWriter. You’ll need to change the JSON Splitter to split your array property.

    The WriteCSVFiles pipeline is the parent. It just contains some mock data and feeds it into the child pipeline.
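A rough sketch of that parent/child flow in plain Python, for readers who cannot open the attached pipelines. This is not SnapLogic code: the function names, the `array` key, and the output file names are assumptions made for illustration, and the sketch assumes each array is non-empty and its elements share the same keys.

```python
import csv
import io


def format_array_to_csv(document, array_key="array"):
    """Child step (hypothetical): split the array property into rows, format as CSV text."""
    rows = document[array_key]  # the property the JSON Splitter would split
    buffer = io.StringIO()
    # Assumption: the array is non-empty and the first element defines the columns.
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def write_csv_files(documents):
    """Parent step (hypothetical): run the child once per document, one CSV per document."""
    return {f"out_{i}.csv": format_array_to_csv(doc) for i, doc in enumerate(documents)}


# Mock data standing in for what the parent pipeline feeds to the child.
mock_documents = [
    {"array": [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]},
    {"array": [{"id": 3, "name": "c"}]},
]
files = write_csv_files(mock_documents)
```

Each input document ends up as its own CSV text because the formatting step runs once per document rather than once over the whole stream.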

9 Replies

  • So normally, snaps like the File Writer snap are one in, one out: one document in, one file out.

    Why does the CSV Formatter snap take multiple documents in and produce only one document out?

  • tstack
    Former Employee

    Yes, pretty much all formatter snaps tend to collect all of their input and write a single binary output file.

    Sorry, that’s not directly supported at the moment. To write a CSV file with a single row in it, you could create a child pipeline with the CSV Formatter and FileWriter and execute it with PipeExec.

    The most common case for a CSV file (and JSON/XML/etc…) would be for it to have multiple rows/records. So, the formatter snaps write all their inputs to a single file. Some formatters (like JSON/XML) support writing a single output document, since those formats are more likely to have cases where that is needed.
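The difference between the two behaviors described here can be sketched in plain Python. This is only an analogy, not how the snaps are implemented: `format_all` stands in for a formatter that collects its whole input, and `format_each` stands in for running a child pipeline once per document. Both names are made up for this example.

```python
import csv
import io


def to_csv(rows):
    """Format a non-empty list of flat dicts as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def format_all(documents):
    """Formatter-like behavior: every input document becomes a row in ONE output."""
    return [to_csv(documents)]


def format_each(documents):
    """Child-per-document behavior: one single-row output per input document."""
    return [to_csv([doc]) for doc in documents]


docs = [{"id": i, "name": f"row{i}"} for i in range(1, 6)]
aggregated = format_all(docs)   # 1 output containing 5 rows
separate = format_each(docs)    # 5 outputs containing 1 row each
```

With 5 input documents, the first function returns a single CSV with a header and 5 rows, while the second returns 5 CSVs, each with its own header and one row.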

    • walkerline117
      Contributor

      Even with the Pipeline Execute snap, I tried it and it seems we cannot leverage its pass-through. So I have to put everything in the pipeline parameters and pass them to the sub-pipeline.

      To me, this snap behaves strangely, especially since it is very different from any other formatter or file-related snap.

      My thought is that if a user wants to format a big CSV file with multiple rows, then, as with other formatter or file-related snaps, they should construct a big document that contains multiple rows.

      • tstack
        Former Employee

        Hmm, can you share some of the pipelines/documents that you’re working with and what is the final result you’re trying to get?

        I’m sorry, I don’t understand why you think that. The default configuration for all formatters is to write all input documents to a single binary output. I’m attaching a pipeline that demonstrates as much:

        FormatterDemo_2019_02_27.slp (12.1 KB)

        I feel like I’m missing something here or there’s some terminology mixup that we need to clear up.

        By “a big document that contains multiple rows”, do you mean a document with an array of values where each element in the array would be a row in the output?

  • Szymon
    New Contributor II

    Hi @walkerline117

    “To me, this snap behaves strangely, especially since it is very different from any other formatter or file-related snap.”

    Yeah, SL is not the sharpest tool in the shed. So the CSV Formatter collects all input documents into one.
    The JSON Formatter has an option not to do that. So why the CSV Formatter does not, I wonder 😉

    Either way, to anyone who struggles with this ‘great feature’: maybe mapping to binary directly would be a solution for you. See screenshots below. Of course, you need to format your content into the required format first.
    Well, I do not lose hope that we get SL v1.0 before I’m retired.

    cheers!

    Document->Binary

    Make sure you map to $content.
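Since the screenshots are not available here, the idea can be approximated in plain Python: format each document as CSV text yourself, then put that text under a `content` key so that each document becomes its own binary payload. This is a sketch of the concept only, not SnapLogic expression syntax; `document_to_binary` and the sample documents are hypothetical.

```python
import csv
import io


def document_to_binary(document):
    """Format ONE flat document as CSV and return it as bytes under a 'content' key."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(list(document.keys()))    # header row
    writer.writerow(list(document.values()))  # single data row
    return {"content": buffer.getvalue().encode("utf-8")}


docs = [{"id": 1, "name": "row1"}, {"id": 2, "name": "row2"}]
# One binary payload per input document, so nothing is aggregated.
binaries = [document_to_binary(doc) for doc in docs]
```

Using a CSV writer rather than joining strings by hand keeps quoting correct when a value contains a comma or a quote character.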