Forum Discussion
arvindnsn - I've attached 2 sample pipelines that will both solve the request. The first uses the Fixed Width Parser snap to read the file into individual documents. Then the Router snap looks at the first set of characters in the returned record to determine which path to send it down for further processing. Each path then uses the Fixed Width Formatter snap to re-create its set of data as a binary stream, followed by the CSV Parser snap to get the individual fields for each type of data. This allows for a simple breakdown of the file into the different data types that you want to process.
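For anyone who wants to reason about the first pipeline outside of SnapLogic, here is a rough Python sketch of the same idea: group the raw lines by their leading record type, then CSV-parse each group separately. This is not the pipeline itself. The sample lines, the function name route_by_record_type, and the assumption that the record type is the first quoted value on each line are all made up for illustration.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample input; the real file layout is not shown in this thread.
SAMPLE = '''"HEADER","20240101","ACME"
"DETAIL","1001","19.99"
"DETAIL","1002","5.00"
"SUMMARY","2","24.99"
'''

def route_by_record_type(text):
    """Group lines by record type, then CSV-parse each group on its own."""
    buckets = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # "Router" step: inspect the start of the line (assumed to be the
        # first quoted value) to decide which path the line belongs to.
        rec_type = line.split(",", 1)[0].strip('"')
        buckets[rec_type].append(line)
    # "Formatter + CSV Parser" step: rebuild each group as one stream of text
    # and parse it into rows of fields.
    return {
        rec_type: list(csv.reader(io.StringIO("\n".join(lines))))
        for rec_type, lines in buckets.items()
    }

routed = route_by_record_type(SAMPLE)
```

Each key in routed plays the role of one Router output path, so every record type can be handled with its own field layout.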
The second example pipeline (v2) accomplishes the same result with a slightly more advanced methodology. Note that I'm still using the Fixed Width Parser to return a "$record" value for each line of the input file. But then I use a Mapper snap and a few advanced expressions to break down the records, allowing me to bypass the reformatting and parsing of the data streams. Here is the expression in the Mapper:
$record.split('","').toObject((val,idx,arr)=> idx == 0 ? "recType" : "field" + "%03d".sprintf(idx), (val,idx,arr)=> val.replaceAll('"',''))
There are a few things going on here so I'll break it down.
$record.split('","') simply splits the string into an array of strings, using the double-quote, comma, double-quote sequence as the field separator. This leaves the first value with a leading double-quote and the last value with a trailing one, which we'll deal with at the end with the String.replaceAll() call. I could have done this with the String.match() method instead, so feel free to play with that if you wish.
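To see why the stray quotes appear, here is a small Python analogue of the split step. The sample record is hypothetical, and the regular-expression variant is only my guess at what a match-based approach could look like, not the SnapLogic String.match() call itself.

```python
import re

# Hypothetical record, as the Fixed Width Parser might return it in $record.
record = '"HEADER","20240101","ACME"'

# Splitting on the three-character separator "," leaves a leading quote on
# the first element and a trailing quote on the last one.
parts = record.split('","')

# Removing every remaining double-quote cleans up both ends.
cleaned = [p.replace('"', '') for p in parts]

# Regex alternative: capture whatever sits between each pair of quotes.
matched = re.findall(r'"([^"]*)"', record)
```

Both cleaned and matched end up holding the bare field values. Note that neither approach copes with a field that itself contains an escaped double-quote.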
Once I have an array, I want to re-create a JSON object that we can use on the output. So the next method is Array.toObject(), which uses two callback functions: the first creates the field name and the second creates the value. For the field name, I want to use "recType" for the first field (HEADER, SUMMARY, etc.) and "field" followed by the zero-padded index (field001, field002, and so on) for the rest. For the value, I can simply return the value of the array element, removing any double-quotes (i.e. the replaceAll() method).
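If it helps to see the whole expression in a general-purpose language, this is roughly what it does, written as a Python sketch. The function name record_to_object and the sample record are hypothetical; the key and value logic mirrors the two callbacks passed to Array.toObject().

```python
def record_to_object(record):
    """Turn one quoted, comma-separated line into a dict of named fields."""
    parts = record.split('","')  # same separator as in the Mapper expression
    return {
        # Key callback: "recType" for index 0, otherwise "field" plus the
        # index zero-padded to three digits.
        ("recType" if idx == 0 else "field%03d" % idx):
        # Value callback: the element itself with all double-quotes removed.
        val.replace('"', '')
        for idx, val in enumerate(parts)
    }

# Hypothetical record for illustration.
obj = record_to_object('"SUMMARY","2","24.99"')
# obj == {"recType": "SUMMARY", "field001": "2", "field002": "24.99"}
```

Because the keys are generated from the position alone, the same logic works for any record type regardless of how many fields it has.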
Hope this helps!
koryknick, thank you for the multiple solutions. I tried both, and the 2nd one seems more interesting because it eliminates the CSV parsers: after the Fixed Width Parser, the Mapper derives the result using expressions. (Realistically, I will be using this methodology where the CSV file has at least 15 to 20 different types of line items.)
Thank you for both the solutions!!