
Using sl.zipObject with Headerless CSV

alex_panganiban
Contributor

Hoping someone might be able to find a solution to a challenge I have. I have a headerless CSV file. I'm also reading in a header array that I wish to match up with the data in the headerless CSV. I don't want to add the header values to the CSV Parser snap, so that I can avoid creating a custom parser for each of the many headerless files I might receive. Basically, here's my situation.

"headers": ["employeeId", "firstName", "lastName"]

The csv file looks like this after going through the CSV Parser.

"field1": "1001", "field2": "Sylvester", "field3": "Stallone"
"field1": "1005", "field2": "John", "field3": "Wick"
"field1": "1010", "field2": "Arnold", "field3": "Schwarzenegger"

The results I desire should look like this:

"employeeId": "1001", "firstName": "Sylvester", "lastName": "Stallone"
"employeeId": "1005", "firstName": "John", "lastName": "Wick"
"employeeId": "1010", "firstName": "Arnold", "lastName": "Schwarzenegger"

I've been playing around with sl.zipObject, but I can't seem to make it work correctly. Hints, tips, and tricks would be greatly appreciated.
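For readers who want to see the intended pairing in isolation, here is a minimal sketch in Python rather than SnapLogic expression syntax. It assumes that sl.zipObject pairs an array of keys with an array of values by position; the `zip_object` helper and the `rows` list are hypothetical stand-ins for illustration, not part of any SnapLogic API.

```python
# Header array as given in the post.
headers = ["employeeId", "firstName", "lastName"]

# Hypothetical stand-in for the parsed rows, already reduced to plain value lists.
rows = [
    ["1001", "Sylvester", "Stallone"],
    ["1005", "John", "Wick"],
    ["1010", "Arnold", "Schwarzenegger"],
]


def zip_object(keys, values):
    """Pair keys with values by position (assumed to mirror sl.zipObject)."""
    return dict(zip(keys, values))


records = [zip_object(headers, row) for row in rows]
print(records[0])
# {'employeeId': '1001', 'firstName': 'Sylvester', 'lastName': 'Stallone'}
```

The pairing only works when each row is a plain list of values in the same order as the header array, which is the part the parser output does not provide directly.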

Thanks, Alex

1 ACCEPTED SOLUTION

alex_panganiban
Contributor

I got my solution! @Spiro_Taleski, thank you so much for your wisdom! This is what Spiro showed me.

I applied his solution to the bigger challenge I was having and this sample pipeline does exactly what I needed it to do.

sample_ConfiguredHeadersToData_2022_11_23.slp (8.0 KB)

Thank you all!

Alex


3 REPLIES

JensDeveloper
Contributor II

Hi @alex.panganiban.guild,

Below is my pipeline that simulates your situation. Could you map field1, field2, and field3 to your corresponding field names after the CSV Parser?

community_pipeline_V1.0_2022_11_23.slp (6.5 KB)

Let me know if this works for you.

Regards

Jens

alex_panganiban
Contributor

Thank you, Jens, for your suggestion. It's certainly a valid solution; however, based on the needs of our application architecture, it won't really fit in with our design pattern. We are moving away from customized mappings and making everything configurable instead. For this reason, I was focused on the sl.zipObject method to automatically align our data with configured headers. Here's an example.

Say we have 2 customers that send us headerless employee data (we actually have more, which makes it even more important to develop a non-customized strategy that works for every customer). We don't enforce any strict schemas or column ordering on our customers, so they can send the data elements in any order.

For customer ABC, we have a header configuration of "[employeeId, firstName, lastName, hireDate]". ABC sends their data like this, with the CSV parser results afterwards:

"1001", "Sylvester", "Stallone", "2001-08-21"
"1010", "Nicole", "Kidman", "1997-05-18"

"field001": "1001", "field002": "Sylvester", "field003": "Stallone", "field004": "2001-08-21"
"field001": "1010", "field002": "Nicole", "field003": "Kidman", "field004": "1997-05-18"

For customer XYZ, we have a header configuration of "[DOB_Date, First_Name, ID, Last_Name]". XYZ sends their data like this, with the CSV parser results afterwards:

"1960-07-04", "Bobby", "99999", "Fischer"
"2005-04-27", "Billie Jean", "88888", "King"

"field001": "1960-07-04", "field002": "Bobby", "field003": "99999", "field004": "Fischer"
"field001": "2005-04-27", "field002": "Billie Jean", "field003": "88888", "field004": "King"

Without having to create a separate parser with embedded header names for each customer, and without having to use a custom mapper for each customer, this is what I want to achieve for each respective customer.

ABC:

"employeeId": "1001", "firstName": "Sylvester", "lastName": "Stallone", "hireDate": "2001-08-21"
"employeeId": "1010", "firstName": "Nicole", "lastName": "Kidman", "hireDate": "1997-05-18"

XYZ:

"DOB_Date": "1960-07-04", "First_Name": "Bobby", "ID": "99999", "Last_Name": "Fischer"
"DOB_Date": "2005-04-27", "First_Name": "Billie Jean", "ID": "88888", "Last_Name": "King"

I've had partial success using ideas derived from "Transforming JSON Data" and "JSON returns Column Names separately from the Rows - #5 by Garrett", together with the sl.zipObject method. However, after going through the parser, my data has the field001, field002, etc. keys on it, which is where I'm running into a wall. I feel that if I could create an array of values for each row of my data, without the keys, this might be the solution I'm looking for.

If I could make my post parser data transform from this:

"field001": "1960-07-04", "field002": "Bobby", "field003": "99999", "field004": "Fischer"

to this, I feel like I could achieve my goal, because once I have the data array, the sl.zipObject method should work exactly as it does in the links I referenced above.

["1960-07-04", "Bobby", "99999", "Fischer"]
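The two steps described here (strip the fieldNNN keys to get an ordered value array, then pair it with the configured headers) can be sketched as follows. This is a Python illustration of the logic only, not SnapLogic expression syntax and not the attached pipeline. The helper names `field_values` and `apply_headers` are hypothetical, and the sketch assumes the parser's keys always end in a numeric suffix that reflects column order.

```python
import re


def field_values(record):
    """Return the record's values ordered by the numeric suffix of each key."""
    def field_number(key):
        match = re.search(r"(\d+)$", key)
        if match is None:
            raise ValueError(f"key without numeric suffix: {key!r}")
        return int(match.group(1))

    return [record[key] for key in sorted(record, key=field_number)]


def apply_headers(headers, record):
    """Pair configured headers with the record's values by position."""
    values = field_values(record)
    if len(values) != len(headers):
        raise ValueError("header count does not match column count")
    return dict(zip(headers, values))


# Header configuration and parsed row for customer XYZ, taken from the post.
xyz_headers = ["DOB_Date", "First_Name", "ID", "Last_Name"]
parsed = {
    "field001": "1960-07-04",
    "field002": "Bobby",
    "field003": "99999",
    "field004": "Fischer",
}

values = field_values(parsed)
named = apply_headers(xyz_headers, parsed)
print(values)  # ['1960-07-04', 'Bobby', '99999', 'Fischer']
print(named)
```

Sorting on the numeric suffix rather than relying on key order keeps the pairing stable even if the keys arrive unordered or without zero padding. The length check catches a header configuration that does not match the file before any values are paired with the wrong names.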

Anyways, thanks again Jens. Even though yours wasn't the exact solution I was looking for, I truly do appreciate your time and generosity in responding to my plea for help. 🙂

Alex
