CSV to XML with repetitive xml nodes

Hi,
I’m trying to create a pipeline which suppose to create an xml file out of a CSV file.

CSV file looks like -
|ProductCode|ProductName|Price|Country|
|A001|DellLaptop|$400|US|
|A001|DellLaptop|$500|AU|
|A001|DellLaptop|$550|EU|

XML output should be -

<?xml version="1.0" encoding="UTF-8"?> US $400 AU $500 EU $550 [XML file attached as its not coming well in the question well..] Thanks for your answer, Deepak Shaw [Product.zip|attachment](upload://7NZiA8X09vabLf1eK3iLKeLDU5f.zip) (203 Bytes)

Hi @deepak.shaw

Could you please provide the elements’ structure of the XML file, so that we may be able to gain better insight as to which transformations and actions should be undertaken to resolve your particular use-case.

Thanks,
Dimitri

1 Like

I don’t believe your XML was attached correctly - please try again

1 Like
<?xml version="1.0" encoding="UTF-8"?>
<ProductCode>
	<ProductName>
		<CostLocation>
			<country>US</country>
			<Cost>$400</Cost>
		</CostLocation>
		<CostLocation>
			<country>AU</country>
			<Cost>$500</Cost>
		</CostLocation>
		<CostLocation>
			<country>EU</country>
			<Cost>$550</Cost>
		</CostLocation>
	</ProductName>
</ProductCode>

I have provided the xml sample output

Hi Experts @robin @dimitri.hristovski, Kindly suggest a solution thanks

Hi @deepak.shaw ,

I have attached a sample pipeline and sample CSV file that I am using for processing.
In the XML Generator snap, if you click the button Edit XML, you can see some logic in there.

Please have a look and let me know if it helps you.

Regards,
Marjan
csv sample.txt (112 Bytes)
CSV_TO_XML_2022_05_18.slp (6.7 KB)

3 Likes

I just want to be clear - you want to drop the values for ProduceCode and ProductName (e.g. A001 and DellLaptop respectively) from the eventual XML, correct? The only values you want to keep are Price and Country, yes?

@marjan.karafiloski has provided a solution that leverages the Expression Language and the XML Generator and it results in the desired XML returned as a String in the output document. Thank you Marjan!

An alternative way to doing this is to use the Gate Snap with an XML Formatter.

But first…

When going from CSV via JSON to XML, it’s often useful to first experiment with JSON Generator + XML Formatter to figure out what input JSON structure equates to the desired XML.

In this case, to get XML that looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<ProductCode>
   <ProductName>
      <CostLocation>
         <country>US</country>
         <Cost>$400</Cost>
      </CostLocation>
      <CostLocation>
         <country>AU</country>
         <Cost>$500</Cost>
      </CostLocation>
      <CostLocation>
         <country>EU</country>
         <Cost>$550</Cost>
      </CostLocation>
   </ProductName>
</ProductCode>

you need JSON that looks like this:

[
    {
        "ProductCode": {
            "ProductName": {
                "CostLocation": [
                    {
                        "country": "US",
                        "Cost": "$400"
                    },
                    {
                        "country": "AU",
                        "Cost": "$500"
                    },
                    {
                        "country": "EU",
                        "Cost": "$550"
                    }
                ]
            }
        }
    }
]

So your goal should be to gather your stream of multiple documents in a way that allows you get to that single JSON Document above.

Breakdown:

The first Snap is a Constant to simulate the original CSV (pipe-delimited and pipe-enclosed):

image

Then a CSV Parser to convert it to Document-format and to use the pipe character as a delimiter:

image

Now a Mapper to drop all the values we don’t care about (and rename the fields we do care about):

Next a Gate Snap to gather the 3 documents into a single document:

image

[{
	"input0": [{
		"country": "US",
		"Cost": "$400"
	}, {
		"country": "AU",
		"Cost": "$500"
	}, {
		"country": "EU",
		"Cost": "$550"
	}]
}]

Now another Mapper to structure the JSON so it can be converted to the XML we want:

Finally a standard XML Formatter:

image

And then you have your desired XML in a binary output format, ready to be written to a File etc

<?xml version="1.0" encoding="UTF-8"?>
<ProductCode>
   <ProductName>
      <CostLocation>
         <country>US</country>
         <Cost>$400</Cost>
      </CostLocation>
      <CostLocation>
         <country>AU</country>
         <Cost>$500</Cost>
      </CostLocation>
      <CostLocation>
         <country>EU</country>
         <Cost>$550</Cost>
      </CostLocation>
   </ProductName>
</ProductCode>
3 Likes

Hi @marjan.karafiloski ,
Thanks for your reply.
The output is coming like:

<?xml version='1.0' encoding='UTF-8'?>
<DocumentRoot>
    <Document>
        <Metadata>
            <asMap>
                <global>
                    <doc_id>ae803607-d70c-11ec-9afc-6bf587c0d9c9</doc_id>
                </global>
            </asMap>
            <id>ae803607-d70c-11ec-9afc-6bf587c0d9c9</id>
        </Metadata>
        <Data>
            <xml>&lt;?xml version="1.0" encoding="UTF-8"?>&lt;Products>&lt;Product>&lt;ProductCode>A001&lt;/ProductCode>&lt;ProductName>DellLaptop&lt;/ProductName>&lt;Price>$400&lt;/Price>&lt;Country>US&lt;/Country>&lt;/Product>&lt;Product>&lt;ProductCode>A001&lt;/ProductCode>&lt;ProductName>DellLaptop&lt;/ProductName>&lt;Price>$500&lt;/Price>&lt;Country>AU&lt;/Country>&lt;/Product>&lt;Product>&lt;ProductCode>A001&lt;/ProductCode>&lt;ProductName>DellLaptop&lt;/ProductName>&lt;Price>$550&lt;/Price>&lt;Country>EU&lt;/Country>&lt;/Product>&lt;/Products></xml>
            <original>
                <products>
                    <ProductCode>A001</ProductCode>
                    <ProductName>DellLaptop</ProductName>
                    <Price>$400</Price>
                    <Country>US</Country>
                </products>
                <products>
                    <ProductCode>A001</ProductCode>
                    <ProductName>DellLaptop</ProductName>
                    <Price>$500</Price>
                    <Country>AU</Country>
                </products>
                <products>
                    <ProductCode>A001</ProductCode>
                    <ProductName>DellLaptop</ProductName>
                    <Price>$550</Price>
                    <Country>EU</Country>
                </products>
            </original>
        </Data>
    </Document>
</DocumentRoot>

With some additional tags, how to remove them?
One more thing How I can introduce namespaces in output xml with the approch you have suggested. Sample xml with output is like:

<?xml version="1.0" encoding="UTF-8"?>
<ProductCode xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2"
    xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2"
    xmlns="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2">
	<cbc:ProductName>
		<cac:CostLocation>
			<cbc:country>US</cbc:country>
			<cbc:Cost>$400</cbc:Cost>
		</cac:CostLocation>
		<cac:CostLocation>
			<cbc:country>AU</cbc:country>
			<cbc:Cost>$500</cbc:Cost>
		</cac:CostLocation>
		<cac:CostLocation>
			<cbc:country>EU</cbc:country>
			<cbc:Cost>$550</cbc:Cost>
		</cac:CostLocation>
	</cbc:ProductName>
</cbc:ProductCode>

Looking forward for your suggestions.
Regards,
Deepak Shaw

Yes @robin , Thats correct

Hi Robin,
Great solution and almost no coding involved.
Can you please help me know to introduce the namespaces into xml, the output looks like:

<?xml version="1.0" encoding="UTF-8"?>
<ProductCode xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2"
    xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2"
    xmlns="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2">
	<cbc:ProductName>
		<cac:CostLocation>
			<cbc:country>US</cbc:country>
			<cbc:Cost>$400</cbc:Cost>
		</cac:CostLocation>
		<cac:CostLocation>
			<cbc:country>AU</cbc:country>
			<cbc:Cost>$500</cbc:Cost>
		</cac:CostLocation>
		<cac:CostLocation>
			<cbc:country>EU</cbc:country>
			<cbc:Cost>$550</cbc:Cost>
		</cac:CostLocation>
	</cbc:ProductName>
</cbc:ProductCode>

Thanks,
Deepak Shaw

The Snap has a DocumentRoot value in the “Root Element” setting

You’ll use our JSON-to-XML conventions, which are basically:

{
    "root": {
        "elementName": {
            "@attributeName": "attributeValue",
            "$": "elementValue"
        },
        "repeatedElement": [
            {
                "key": {
                    "@attr": "attr1",
                    "$": "key1"
                },
                "value": "value1"
            },
            {
                "key": {
                    "@attr": "attr2",
                    "$": "key2"
                },
                "value": "value2"
            }
        ]
    }
}

That input Document into the XML Formatter will generate the following XML:

<?xml version="1.0" encoding="UTF-8"?>
<root>
   <elementName attributeName="attributeValue">elementValue</elementName>
   <repeatedElement>
      <key attr="attr1">key1</key>
      <value>value1</value>
   </repeatedElement>
   <repeatedElement>
      <key attr="attr2">key2</key>
      <value>value2</value>
   </repeatedElement>
</root>

So, for the sample I provided above, the following should give you enough information to get to your final desired XML:

[
    {
        "ProductCode": {
            "@xmlns:cac": "urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2",
            "@xmlns:cbc": "urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2",
            "@xmlns": "urn:oasis:names:specification:ubl:schema:xsd:Invoice-2",
            "cbc:ProductName": {
                "cac:CostLocation": [
                    {
                        "cbc:country": "US",
                        "cbc:Cost": "$400"
                    },
                    {
                        "cbc:country": "AU",
                        "cbc:Cost": "$500"
                    },
                    {
                        "cbc:country": "EU",
                        "cbc:Cost": "$550"
                    }
                ]
            }
        }
    }
]

which results in:

<?xml version="1.0" encoding="UTF-8"?>
<ProductCode xmlns="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2" xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2" xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2">
   <cbc:ProductName>
      <cac:CostLocation>
         <cbc:country>US</cbc:country>
         <cbc:Cost>$400</cbc:Cost>
      </cac:CostLocation>
      <cac:CostLocation>
         <cbc:country>AU</cbc:country>
         <cbc:Cost>$500</cbc:Cost>
      </cac:CostLocation>
      <cac:CostLocation>
         <cbc:country>EU</cbc:country>
         <cbc:Cost>$550</cbc:Cost>
      </cac:CostLocation>
   </cbc:ProductName>
</ProductCode>

Go from there to finish the rest of the XML you want.

1 Like

HI @robin ,
Thanks for the reply…
The steps you have suggested is (as I understand):
JSON Generator >> XML Formatter >> File Writer

If Yes, I still don’t understand how to introduce the namespace…

Related to JSON-to-XML conventions, are you referring to this URL:Convert json to xml

If possible can have a sample pipeline?
Thanks,
Deepak.

Hi @robin,
I could able to follow your suggestions and created a sample pipeline (attached)
However, it creates xml output bit different

<?xml version='1.0' encoding='UTF-8'?>
<Invoice
    xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2"
    xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2"
    xmlns="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2">
    <cbc:CustomizationID>urn:cen.eu:en16931:2017#conformant#urn:fdc:peppol.eu:2017:poacc:billing:international:aunz:3.0</cbc:CustomizationID>
    <cbc:ProfileID>urn:fdc:peppol.eu:2017:poacc:billing:01:1.0</cbc:ProfileID>
    <cbc:ID>PRP100418</cbc:ID>
    <cbc:IssueDate>29/04/2022</cbc:IssueDate>
    <cbc:DueDate>6/05/2022</cbc:DueDate>
    <cbc:InvoiceTypeCode>380</cbc:InvoiceTypeCode>
    <cbc:Note>110535 - DAF - XRef607 ICT Services - CRM Developer</cbc:Note>
    <cbc:DocumentCurrencyCode>AUD</cbc:DocumentCurrencyCode>
    <InvoiceLines>
        <LineNumber>1</LineNumber>
        <LineNumber>2</LineNumber>
        <Tax>700</Tax>
        <Tax>700</Tax>
    </InvoiceLines>
    <ABNValue>ABN 31 100 103 268</ABNValue>
</Invoice>

Can you please suggest the fix?
Csv2Xml_2022_05_19.slp (17.0 KB)

Thanks,
Deepak.

Sorry, a fix to what? I don’t understand what the issue is.

Hi @robin ,
I would like the child node should appear like :

Just a small change in your Mapper needed:

That will output JSON like:

[{
	"Invoice": {
		"@xmlns:cac": "urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2",
		"@xmlns:cbc": "urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2",
		"@xmlns": "urn:oasis:names:specification:ubl:schema:xsd:Invoice-2",
		"cbc:CustomizationID": "urn:cen.eu:en16931:2017#conformant#urn:fdc:peppol.eu:2017:poacc:billing:international:aunz:3.0",
		"cbc:ProfileID": "urn:fdc:peppol.eu:2017:poacc:billing:01:1.0",
		"cbc:ID": "PRP100418",
		"cbc:IssueDate": "29/04/2022",
		"cbc:DueDate": "6/05/2022",
		"cbc:InvoiceTypeCode": "380",
		"cbc:Note": "110535 - DAF - XRef607 ICT Services - CRM Developer",
		"cbc:DocumentCurrencyCode": "AUD",
		"InvoiceLines": [{
			"LineNumber": "1",
			"Tax": "700"
		}, {
			"LineNumber": "2",
			"Tax": "700"
		}],
		"ABNValue": "ABN 31 100 103 268"
	}
}]

which the XML Formatter will turn into:

<?xml version="1.0" encoding="UTF-8"?>
<Invoice xmlns="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2" xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2" xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2">
   <cbc:CustomizationID>urn:cen.eu:en16931:2017#conformant#urn:fdc:peppol.eu:2017:poacc:billing:international:aunz:3.0</cbc:CustomizationID>
   <cbc:ProfileID>urn:fdc:peppol.eu:2017:poacc:billing:01:1.0</cbc:ProfileID>
   <cbc:ID>PRP100418</cbc:ID>
   <cbc:IssueDate>29/04/2022</cbc:IssueDate>
   <cbc:DueDate>6/05/2022</cbc:DueDate>
   <cbc:InvoiceTypeCode>380</cbc:InvoiceTypeCode>
   <cbc:Note>110535 - DAF - XRef607 ICT Services - CRM Developer</cbc:Note>
   <cbc:DocumentCurrencyCode>AUD</cbc:DocumentCurrencyCode>
   <InvoiceLines>
      <LineNumber>1</LineNumber>
      <Tax>700</Tax>
   </InvoiceLines>
   <InvoiceLines>
      <LineNumber>2</LineNumber>
      <Tax>700</Tax>
   </InvoiceLines>
   <ABNValue>ABN 31 100 103 268</ABNValue>
</Invoice>

which is what I believe you wanted.

1 Like

Hi @robin,
Thanks for your help… really appriciated.
Regards,
Deepak.