cancel
Showing results for 
Search instead for 
Did you mean: 

CSV to XML with repetitive xml nodes

deepak_shaw
Contributor

Hi,
I’m trying to create a pipeline which suppose to create an xml file out of a CSV file.

CSV file looks like -
|ProductCode|ProductName|Price|Country|
|A001|DellLaptop|$400|US|
|A001|DellLaptop|$500|AU|
|A001|DellLaptop|$550|EU|

XML output should be -

<?xml version="1.0" encoding="UTF-8"?> US $400 AU $500 EU $550 [XML file attached as its not coming well in the question well..] Thanks for your answer, Deepak Shaw [Product.zip|attachment](upload://7NZiA8X09vabLf1eK3iLKeLDU5f.zip) (203 Bytes)
1 ACCEPTED SOLUTION

Just a small change in your Mapper needed:

image

That will output JSON like:

[{
	"Invoice": {
		"@xmlns:cac": "urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2",
		"@xmlns:cbc": "urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2",
		"@xmlns": "urn:oasis:names:specification:ubl:schema:xsd:Invoice-2",
		"cbc:CustomizationID": "urn:cen.eu:en16931:2017#conformant#urn:fdc:peppol.eu:2017:poacc:billing:international:aunz:3.0",
		"cbc:ProfileID": "urn:fdc:peppol.eu:2017:poacc:billing:01:1.0",
		"cbc:ID": "PRP100418",
		"cbc:IssueDate": "29/04/2022",
		"cbc:DueDate": "6/05/2022",
		"cbc:InvoiceTypeCode": "380",
		"cbc:Note": "110535 - DAF - XRef607 ICT Services - CRM Developer",
		"cbc:DocumentCurrencyCode": "AUD",
		"InvoiceLines": [{
			"LineNumber": "1",
			"Tax": "700"
		}, {
			"LineNumber": "2",
			"Tax": "700"
		}],
		"ABNValue": "ABN 31 100 103 268"
	}
}]

which the XML Formatter will turn into:

<?xml version="1.0" encoding="UTF-8"?>
<Invoice xmlns="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2" xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2" xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2">
   <cbc:CustomizationID>urn:cen.eu:en16931:2017#conformant#urn:fdc:peppol.eu:2017:poacc:billing:international:aunz:3.0</cbc:CustomizationID>
   <cbc:ProfileID>urn:fdc:peppol.eu:2017:poacc:billing:01:1.0</cbc:ProfileID>
   <cbc:ID>PRP100418</cbc:ID>
   <cbc:IssueDate>29/04/2022</cbc:IssueDate>
   <cbc:DueDate>6/05/2022</cbc:DueDate>
   <cbc:InvoiceTypeCode>380</cbc:InvoiceTypeCode>
   <cbc:Note>110535 - DAF - XRef607 ICT Services - CRM Developer</cbc:Note>
   <cbc:DocumentCurrencyCode>AUD</cbc:DocumentCurrencyCode>
   <InvoiceLines>
      <LineNumber>1</LineNumber>
      <Tax>700</Tax>
   </InvoiceLines>
   <InvoiceLines>
      <LineNumber>2</LineNumber>
      <Tax>700</Tax>
   </InvoiceLines>
   <ABNValue>ABN 31 100 103 268</ABNValue>
</Invoice>

which is what I believe you wanted.

View solution in original post

19 REPLIES 19

The Snap has a DocumentRoot value in the “Root Element” setting

robin
Former Employee

@marjan.karafiloski has provided a solution that leverages the Expression Language and the XML Generator and it results in the desired XML returned as a String in the output document. Thank you Marjan!

An alternative way to doing this is to use the Gate Snap with an XML Formatter.

image

But first…

When going from CSV via JSON to XML, it’s often useful to first experiment with JSON Generator + XML Formatter to figure out what input JSON structure equates to the desired XML.

In this case, to get XML that looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<ProductCode>
   <ProductName>
      <CostLocation>
         <country>US</country>
         <Cost>$400</Cost>
      </CostLocation>
      <CostLocation>
         <country>AU</country>
         <Cost>$500</Cost>
      </CostLocation>
      <CostLocation>
         <country>EU</country>
         <Cost>$550</Cost>
      </CostLocation>
   </ProductName>
</ProductCode>

you need JSON that looks like this:

[
    {
        "ProductCode": {
            "ProductName": {
                "CostLocation": [
                    {
                        "country": "US",
                        "Cost": "$400"
                    },
                    {
                        "country": "AU",
                        "Cost": "$500"
                    },
                    {
                        "country": "EU",
                        "Cost": "$550"
                    }
                ]
            }
        }
    }
]

So your goal should be to gather your stream of multiple documents in a way that allows you get to that single JSON Document above.

Breakdown:

The first Snap is a Constant to simulate the original CSV (pipe-delimited and pipe-enclosed):

image

Then a CSV Parser to convert it to Document-format and to use the pipe character as a delimiter:

image

Now a Mapper to drop all the values we don’t care about (and rename the fields we do care about):

image

Next a Gate Snap to gather the 3 documents into a single document:

image

[{
	"input0": [{
		"country": "US",
		"Cost": "$400"
	}, {
		"country": "AU",
		"Cost": "$500"
	}, {
		"country": "EU",
		"Cost": "$550"
	}]
}]

Now another Mapper to structure the JSON so it can be converted to the XML we want:

image

Finally a standard XML Formatter:

image

And then you have your desired XML in a binary output format, ready to be written to a File etc

<?xml version="1.0" encoding="UTF-8"?>
<ProductCode>
   <ProductName>
      <CostLocation>
         <country>US</country>
         <Cost>$400</Cost>
      </CostLocation>
      <CostLocation>
         <country>AU</country>
         <Cost>$500</Cost>
      </CostLocation>
      <CostLocation>
         <country>EU</country>
         <Cost>$550</Cost>
      </CostLocation>
   </ProductName>
</ProductCode>

Hi Robin,
Great solution and almost no coding involved.
Can you please help me know to introduce the namespaces into xml, the output looks like:

<?xml version="1.0" encoding="UTF-8"?>
<ProductCode xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2"
    xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2"
    xmlns="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2">
	<cbc:ProductName>
		<cac:CostLocation>
			<cbc:country>US</cbc:country>
			<cbc:Cost>$400</cbc:Cost>
		</cac:CostLocation>
		<cac:CostLocation>
			<cbc:country>AU</cbc:country>
			<cbc:Cost>$500</cbc:Cost>
		</cac:CostLocation>
		<cac:CostLocation>
			<cbc:country>EU</cbc:country>
			<cbc:Cost>$550</cbc:Cost>
		</cac:CostLocation>
	</cbc:ProductName>
</cbc:ProductCode>

Thanks,
Deepak Shaw

You’ll use our JSON-to-XML conventions, which are basically:

{
    "root": {
        "elementName": {
            "@attributeName": "attributeValue",
            "$": "elementValue"
        },
        "repeatedElement": [
            {
                "key": {
                    "@attr": "attr1",
                    "$": "key1"
                },
                "value": "value1"
            },
            {
                "key": {
                    "@attr": "attr2",
                    "$": "key2"
                },
                "value": "value2"
            }
        ]
    }
}

That input Document into the XML Formatter will generate the following XML:

<?xml version="1.0" encoding="UTF-8"?>
<root>
   <elementName attributeName="attributeValue">elementValue</elementName>
   <repeatedElement>
      <key attr="attr1">key1</key>
      <value>value1</value>
   </repeatedElement>
   <repeatedElement>
      <key attr="attr2">key2</key>
      <value>value2</value>
   </repeatedElement>
</root>

So, for the sample I provided above, the following should give you enough information to get to your final desired XML:

[
    {
        "ProductCode": {
            "@xmlns:cac": "urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2",
            "@xmlns:cbc": "urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2",
            "@xmlns": "urn:oasis:names:specification:ubl:schema:xsd:Invoice-2",
            "cbc:ProductName": {
                "cac:CostLocation": [
                    {
                        "cbc:country": "US",
                        "cbc:Cost": "$400"
                    },
                    {
                        "cbc:country": "AU",
                        "cbc:Cost": "$500"
                    },
                    {
                        "cbc:country": "EU",
                        "cbc:Cost": "$550"
                    }
                ]
            }
        }
    }
]

which results in:

<?xml version="1.0" encoding="UTF-8"?>
<ProductCode xmlns="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2" xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2" xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2">
   <cbc:ProductName>
      <cac:CostLocation>
         <cbc:country>US</cbc:country>
         <cbc:Cost>$400</cbc:Cost>
      </cac:CostLocation>
      <cac:CostLocation>
         <cbc:country>AU</cbc:country>
         <cbc:Cost>$500</cbc:Cost>
      </cac:CostLocation>
      <cac:CostLocation>
         <cbc:country>EU</cbc:country>
         <cbc:Cost>$550</cbc:Cost>
      </cac:CostLocation>
   </cbc:ProductName>
</ProductCode>

Go from there to finish the rest of the XML you want.

HI @robin ,
Thanks for the reply…
The steps you have suggested is (as I understand):
JSON Generator >> XML Formatter >> File Writer

If Yes, I still don’t understand how to introduce the namespace…

Related to JSON-to-XML conventions, are you referring to this URL:Convert json to xml

If possible can have a sample pipeline?
Thanks,
Deepak.