Interesting question on changing JSON keys dynamically

walkerline117
Contributor

Here’s our use case.
We have two files:

  1. XML: contains the metadata (e.g., column name, column order) describing the data file
  2. CSV: contains the actual data

Our goal:

Read the CSV file and replace the default column names (for a CSV file, SnapLogic uses field001, field002, etc. as the key of each column) with the actual column names from the XML file.

For example, here is the xml file content:

<SchemaTable>
    <ColumnName>ftl_type_id</ColumnName>
    <ColumnOrdinal>0</ColumnOrdinal>
</SchemaTable>
<SchemaTable>
    <ColumnName>ftl_title</ColumnName>
    <ColumnOrdinal>1</ColumnOrdinal>
</SchemaTable>

Here is the CSV file content:
1,Region
2,Country

Expected output:

{"ftl_type_id": 1, "ftl_title": "Region"}
{"ftl_type_id": 2, "ftl_title": "Country"}

Is there a way to do this in a single pipeline?
Thanks

1 ACCEPTED SOLUTION

tstack
Former Employee

The CSV Parser supports a second input view that you can use to feed it a simple schema. So, you should be able to read in the XML file with the schema and then convert it into the format that the CSV Parser expects. The pipeline attached below does just that; you should be able to replace the Constant Snaps with File Readers to make it work for you.

CSVSchema_2018_03_12.slp (8.1 KB)

Here’s the expression in the Mapper that is doing the conversion:

{}.extend($SchemaTables.SchemaTable.sort((x, y) => x.ColumnOrdinal - y.ColumnOrdinal).map(x => [x.ColumnName, "string"]))

It first sorts the SchemaTable array by the ColumnOrdinal and then uses a map() to produce an array of key/value pairs. Those pairs are then fed into the extend() method to create the output document that is turned into the CSV schema file.
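For illustration, here is roughly what that transformation looks like. The exact input shape depends on how the XML Parser emits your document; this sketch assumes the SchemaTable elements arrive as an array under $SchemaTables (note that the XML Parser typically emits values as strings, which the subtraction in sort() coerces to numbers):

{
  "SchemaTables": {
    "SchemaTable": [
      { "ColumnName": "ftl_title", "ColumnOrdinal": "1" },
      { "ColumnName": "ftl_type_id", "ColumnOrdinal": "0" }
    ]
  }
}

The Mapper expression turns that into the schema document fed to the CSV Parser's second input view:

{ "ftl_type_id": "string", "ftl_title": "string" }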

The problem could be attacked in other ways as well. For example, the XML schema file could be converted into an expression library and imported into the pipeline. Then, a mapper could be added that used the mapKeys() method to lookup the column number in the schema library and mapped it to the column name.
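As a rough sketch of that alternative (the file and property names here are hypothetical): the schema could be saved as an expression library, say schema.expr:

{
  "columns": {
    "field001": "ftl_type_id",
    "field002": "ftl_title"
  }
}

With that file imported in the pipeline properties under the name schema, a Mapper could rename the keys with something like:

$.mapKeys((value, key) => lib.schema.columns[key] || key)

The || key fallback just leaves any column that isn't in the library unchanged.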


5 REPLIES

walkerline117
Contributor

Hi, can anyone help?

chris
New Contributor

It’s not technically a single pipeline, but you could read in the XML, pass the metadata fields as parameters to a Pipeline Execute, which would then execute a second pipeline that reads in the CSV and renames the columns using the parameters.
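A minimal sketch of the child pipeline's Mapper, assuming the parent passes the ordered column names as a comma-separated pipeline parameter named colnames (a hypothetical name):

$.mapKeys((value, key) => _colnames.split(',')[parseInt(key.replace('field', '')) - 1])

This strips the field prefix from each default key (field001, field002, ...), converts the remainder to a number, and looks up the corresponding name in the parameter list.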

Thanks! I came up with the same solution.
