CSV file with a varying number of columns in each row

My CSV file has several rows for each person in the file. Different rows have different numbers of columns, and the CSV file reader is having trouble with it. Is this file impossible to read with this snap?

Hi @heidi.andrew,

I haven’t seen this issue so far with the File Reader snap. Can you please elaborate on the snaps you’re using for this?
Usually, I would go with File Reader → CSV Parser → Mapper (with null-safe access checked).
If you’re using the same snaps, can you share the settings you’re using for the File Reader and CSV Parser?


Here are a couple of people and their rows:
EEID|55555|xxxxxxxxx|MICHAEL|C|xxxxx|M|06/21/xxxx
OFFR|55555|01/01/2023-55|OE|01/01/2023|01/01/2023|01/01/2023|12/31/2023|02/06/2023 05:00:00.000000 AM
ELIG|55555|01/01/2023-55|CDHU02500L|BLUE BASIC|EEONLY|EMPLOYEE ONLY|Y|103.28|511.62|Y|Y|Y|Y|Y|||
COVG|55555|OE|01/01/2023|||||||01/01/2023|||||Y|06|UNKNOWN|02/06/2023 05:00:00.000000 AM|01/01/2023-55
EEID|7777777|xxxxxxxxx|KEITH|E|xxxxxxxx|M|11/11/xxxx
OFFR|7777777|xxxxxxxxx|KEITH|E|xxxxxxxx|M|11/11/xxxx
OFFR|7777777|01/01/2023-55|OE|01/01/2023|01/01/2023|01/01/2023|12/31/2023|02/06/2023 05:00:00.000000 AM
ELIG|7777777|01/01/2023-55|CDHU02500L|BLUE BASIC|EEONLY|EMPLOYEE ONLY|Y|203.28|411.62|Y|Y|Y|Y|Y|||
COVG|7777777|OE|01/01/2023|||||||01/01/2023|||||Y|06|UNKNOWN|02/06/2023 05:00:00.000000 AM|01/01/2023-55
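
To make the mismatch concrete, here is a minimal Python sketch (run outside SnapLogic, not a snap feature) that counts the fields in three of the rows above. The `column_counts` helper is a hypothetical name used only for illustration, and the pipe delimiter is taken from the sample.

```python
import csv
import io

# Three sample rows copied from the post above (pipe-delimited, no header row).
SAMPLE = """\
EEID|55555|xxxxxxxxx|MICHAEL|C|xxxxx|M|06/21/xxxx
OFFR|55555|01/01/2023-55|OE|01/01/2023|01/01/2023|01/01/2023|12/31/2023|02/06/2023 05:00:00.000000 AM
ELIG|55555|01/01/2023-55|CDHU02500L|BLUE BASIC|EEONLY|EMPLOYEE ONLY|Y|103.28|511.62|Y|Y|Y|Y|Y|||
"""


def column_counts(text, delimiter="|"):
    """Return (row type, number of fields) for every non-empty row."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return [(row[0], len(row)) for row in reader if row]


counts = column_counts(SAMPLE)
for row_type, n in counts:
    print(row_type, n)
```

Each row type has its own width, so a single fixed list of header names cannot line up with every row.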

I also played around with the ignore empty data and surrounding spaces features, but that did not work for me either. There are no headings in the file, but I gave it header names and thought that might help. I am joining the rows with another file. That step does not give errors yet, as I have been unable to get past the CSV file reader.
image

Hi @heidi.andrew,

Can you please share a sample .csv file, if you don’t mind? The snap properties look good to me, but I might be overlooking something, so I wanted to try it on my end.

If the data is sensitive, you can truncate it and share a reduced version of the file.
Thanks.

Regards,
Darsh

different column numbers on rows.txt (773 Bytes)

Any luck with this little file?

Hi @heidi.andrew,

Good day, hope this helps. A POC pipeline is attached.

canvas_01_2023_02_09.slp (11.6 KB)

Thanks,
MM

This worked! I do not understand why or how. I have not used any of the JSON features yet, but I will study it and learn how this works. Thank you so much.

Glad to be of help

@heidi.andrew - Apologies for the delay in getting back to you. I haven’t had a chance to look at the sample file you provided, but it seems the solution provided by @alchemiz worked. I will also test it on my end when I get a chance and keep you posted.

Thank you @alchemiz for your assistance.

Best Regards,
Darsh

@heidi.andrew - here is another pipeline that provides a couple more examples of how this could be done. It contains two examples. The first is similar to what @alchemiz has done, but with an added step to simplify the syntax of converting each record into an array and then an object. The second path simplifies it even further, using only basic snap functionality to split the input stream so that each part can be parsed separately based on the input row type.

I hope this helps!
Community 15513 - Multi-row CSV_2023_02_22.slp (24.3 KB)
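
For readers who want to see the idea outside SnapLogic, here is a rough Python sketch of the two steps described above: turn each raw line into an array and then an object, and split the stream by row type so each type can be handled separately. It is not a translation of the attached pipeline. The function names and the generated `col1`, `col2`, ... keys are made up for illustration, and treating the second field as a person ID is an assumption based on the sample rows.

```python
from collections import defaultdict

# A few rows from the sample posted earlier in the thread.
SAMPLE_LINES = [
    "EEID|55555|xxxxxxxxx|MICHAEL|C|xxxxx|M|06/21/xxxx",
    "OFFR|55555|01/01/2023-55|OE|01/01/2023|01/01/2023|01/01/2023|12/31/2023|02/06/2023 05:00:00.000000 AM",
    "EEID|7777777|xxxxxxxxx|KEITH|E|xxxxxxxx|M|11/11/xxxx",
]


def line_to_object(line, delimiter="|"):
    """Split one raw line into an array, then turn that array into an object."""
    fields = line.split(delimiter)
    # The file has no header, so keys are generated (col1, col2, ...).
    obj = {f"col{i}": value for i, value in enumerate(fields[2:], start=1)}
    obj["row_type"] = fields[0]   # e.g. EEID, OFFR, ELIG, COVG
    obj["person_id"] = fields[1]  # assumption: second field identifies the person
    return obj


def split_by_row_type(lines):
    """Group parsed rows by row type so each type can be processed on its own."""
    groups = defaultdict(list)
    for line in lines:
        if line.strip():
            obj = line_to_object(line)
            groups[obj["row_type"]].append(obj)
    return dict(groups)


by_type = split_by_row_type(SAMPLE_LINES)
for row_type, rows in by_type.items():
    print(row_type, len(rows))
```

Once the rows are grouped this way, each row type can be given its own column names, since every row within a group should have the same width.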

Thank you! I will check it out. I really needed a simpler pipeline to understand it, and maybe yours will be just that.

I think this will work. Thanks!