Forum Discussion
Thanks for your response @bojanvelevski.
I am trying to test and understand this sample pipeline. Can you kindly give me a high-level explanation of it? I don’t recognize the code in the Mapper snap, and it’s doing a lot, as seen in the pipeline statistics below (CPU % and memory).
Are the [Binary to Document] and [Mapper] snaps a manual replacement for the [CSV Parser] snap? Where can I find documentation for the syntax used?
Hey @JPaps,
I’m using the Binary to Document snap to get a stringified version of the CSV, where rows (records) are separated by \r\n. That’s why the expression first splits the string on ‘\r\n’, so I can get the headers and the records separately. The rest is logic to fill the missing headers with ‘field_’ + a random number, which in your case might be a problem due to the high number of records. Anyway, the expression creates objects out of the headers and the records. But because sl.zipObject automatically adds nulls where there’s no value (and in our case we need a replacement for the missing header, not the value), I created the object with key and value reversed, meaning:
{"value":"key"}
If there’s a missing key, then the expression will automatically add null, which is later replaced with ‘field_’ + a random 2-digit number. After filling in the missing value (soon to be key), I’m reversing the object back:
{"key":"value"}
It’s rough, and it works for a small number of records, but it needs to be tested with a large number of records, like in your case.
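For readers who don't know the expression syntax, the same idea can be sketched outside SnapLogic. The Python below is only an illustration, not the Mapper expression from the pipeline: `parse_csv_text` is a hypothetical name, it assumes a plain comma-separated file with no quoted fields, and it pads the header list directly instead of going through the key/value reversal described above.

```python
import random

def parse_csv_text(text, rng=None):
    """Split CRLF-separated CSV text into records, padding missing headers.

    Hypothetical helper: assumes comma-separated values with no quoting.
    """
    rng = rng or random.Random()
    # Rows are separated by \r\n; drop the empty string left by a trailing \r\n.
    lines = [line for line in text.split("\r\n") if line]
    headers = lines[0].split(",")
    records = []
    for line in lines[1:]:
        values = line.split(",")
        # Copy the headers and add a 'field_NN' placeholder for every value
        # that has no header of its own.
        keys = list(headers)
        while len(keys) < len(values):
            keys.append("field_" + str(rng.randint(10, 99)))
        # Pair each key with its value to build the record object.
        records.append(dict(zip(keys, values)))
    return records

rows = parse_csv_text("id,name\r\n1,Ann,extra\r\n2,Bob\r\n")
print(rows)
```

Two caveats with this sketch: a two-digit random suffix can repeat within one record, in which case the later value overwrites the earlier one, and the placeholder names differ from record to record. Both get more likely to cause trouble as the number of records and missing headers grows.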
Regards,
Bojan