cancel
Showing results for 
Search instead for 
Did you mean: 

Multi child xml to csv help required

bourner
New Contributor

HI,

I have an xml file that has multiple child arrays within it.

I’m struggling to get it into a flat format for export to csv, could someone spread a little snap love on this and help 🙂

xml looks like this

<?xml version="1.0" encoding="utf-8"?>
	<FLEET>
		<ZMI_PM_S_OUTPUT_FLEET_LIST>
			<V_FLEET>
				<FLEET_NUM>SKBMI823EJ</FLEET_NUM>
				<EQUNR>000000000010014880</EQUNR>
				<ERDAT>2017-09-18</ERDAT>
				<ERNAM>DB21016</ERNAM>
				<EQASP>E</EQASP>
				<AEDAT>2021-10-01</AEDAT>
			</V_FLEET>
			<KSSK>
				<ZMI_PM_S_FLEET_LIST_KSSK>
					<KSSK_H>
						<OBJEK>000000000010014880</OBJEK>
						<MAFID>O</MAFID>
						<KLART>002</KLART>
						<CLINT>0000027802</CLINT>
						<ZAEHL>20</ZAEHL>
						<STATU>1</STATU>
					</KSSK_H>
					<AUSP>
						<API_VAL_R>
							<CHARACT>PM_ARB_A_BSTNR</CHARACT>
							<VALUE>10934902</VALUE>
						</API_VAL_R>
						<API_VAL_R>
							<CHARACT>PM_ARB_A_BSTPO</CHARACT>
							<VALUE>140</VALUE>
						</API_VAL_R>
					</AUSP>
				</ZMI_PM_S_FLEET_LIST_KSSK>
				<ZMI_PM_S_FLEET_LIST_KSSK>
					<KSSK_H>
						<OBJEK>000000000010014880</OBJEK>
						<MAFID>O</MAFID>
						<KLART>002</KLART>
						<CLINT>0000027803</CLINT>
						<ZAEHL>10</ZAEHL>
						<STATU>1</STATU>
					</KSSK_H>
					<AUSP>
						<API_VAL_R>
							<CHARACT>PM_ARB_A_AC_MAN</CHARACT>
							<VALUE>WEBASTO ATHEMIA</VALUE>
						</API_VAL_R>
						<API_VAL_R>
							<CHARACT>PM_ARB_A_ADVERT_FRAMES</CHARACT>
							<VALUE>No</VALUE>
						</API_VAL_R>
						<API_VAL_R>
							<CHARACT>PM_ARB_A_CCTV</CHARACT>
							<VALUE>No</VALUE>
						</API_VAL_R>
						<API_VAL_R>
							<CHARACT>PM_ARB_A_CURRENT_REG_DT</CHARACT>
							<VALUE>26.01.2017</VALUE>
						</API_VAL_R>
						<API_VAL_R>
							<CHARACT>PM_ARB_A_EMISSIONS</CHARACT>
							<VALUE>16.08.2019</VALUE>
						</API_VAL_R>
						<API_VAL_R>
							<CHARACT>PM_ARB_A_ENG_EURO_NORM</CHARACT>
							<VALUE>EURO VI</VALUE>
						</API_VAL_R>
						<API_VAL_R>
							<CHARACT>PM_ARB_A_GPS</CHARACT>
							<VALUE>No</VALUE>
						</API_VAL_R>
						<API_VAL_R>
							<CHARACT>PM_ARB_A_MAN_DT</CHARACT>
							<VALUE>26.01.2017</VALUE>
						</API_VAL_R>
					</AUSP>
				</ZMI_PM_S_FLEET_LIST_KSSK>
			</KSSK>
		</ZMI_PM_S_OUTPUT_FLEET_LIST>
	</FLEET>

if it helps “OBJEK” in <KSSK_H> is related to “EQUNR” in <V_FLEET>

Many thanks

10 REPLIES 10

robin
Former Employee

Could you try again to provide the XML sample (either as an attachment or using Markdown to enter a code block)

bourner
New Contributor

testxml.txt (2.1 KB)

I’ve attached the xml in txt document.

Is it also possible to rotate the ‘attributes’ (KSSK elements) so that there is only 1 or 2 rows row of data per fleet record ? (Some may not have any attributes)

Thanks

bourner
New Contributor

Don’t think I’m being overly clear here, apologies.

What you’ve done is brilliant, I think what i need to do is pivot so that the CHARACT rows become column headings in themselves, with the VALUE as the value.

So in the above image it would go

AEDAT, EQASP, EQUNR, ERDAT, ERNUM, FLEET NUM, PM_ARB_A_BSTNR, PM_ARB_A_BSTPO…

I’ve attached an example
examplecsv.txt (456 Bytes)

robin
Former Employee

Got it. Here’s the solution - definitely some advanced Expression Language usage is required but I did learn quite a bit doing it:

image
12465_xml-to-flatten-csv_2022_05_20.slp (13.6 KB)

Since you want to keep it to two rows, we drop the second JSON Splitter from my first solution above - and I’ve renamed the first Splitter to Split on KSSK:

image

So we have data that looks like this now:

image

Since you want CHARACT values like PM_ARB_A_AC_MAN to become keys, we have to do an Unpivot (i.e. making { "key": "someKey", "value": "someValue" } become { "someKey": "someValue" })

I used the following expression to do that (note: yes, this should really be a new Snap😞

$AUSP.API_VAL_R.map((elem) => { [elem.CHARACT]: elem.VALUE } )

mapped to a temporary field $unpivoted.

I also took the opportunity to flatten the structure of the other fields in the JSON in preparation for CSV:

image

What’s important here is that I moved them under a placeholder $root object. That is because I’ll want to merge contents of the $unpivoted array into $root whilst also dropping the array itself. It’s easier to do it this way rather than trying to modify the root $ document and also remove the array at the same time.

The last tricky step is the merge:

image

Again I use the Expression Language - this time the object.extend(target:data) Object function:

$root.extend(...$unpivoted)

The triple-dot is the Spread Operator/Syntax:

Spread syntax ( ... ) allows an iterable such as an array expression or string to be expanded in places where zero or more arguments (for function calls) or elements (for array literals) are expected, or an object expression to be expanded in places where zero or more key-value pairs (for object literals) are expected.

In other words, I can pass an array of objects of key/value pairs, and extend() will combine the key/value pairs into a single object. And since I am extending the $root objects, those k/v pairs will be combined with the others!

Binding the result of the expression to $ means I’ll have two flat objects ready to be used for CSV.

Finally, you need to manually specify the CSV Formatter’s Headers to use, since the documents have different fields (the default is to use the headers from the first document).

Enabling the “Use Header” checkbox and listing the fields from the input document:

image

should output your desired CSV content:

"AEDAT","EQASP","EQUNR","ERDAT","ERNAM","FLEET_NUM","CLINT","KLART","MAFID","PM_ARB_A_BSTNR","PM_ARB_A_BSTPO","PM_ARB_A_AC_MAN","PM_ARB_A_ADVERT_FRAMES","PM_ARB_A_CCTV","PM_ARB_A_CURRENT_REG_DT","PM_ARB_A_EMISSIONS","PM_ARB_A_ENG_EURO_NORM","PM_ARB_A_GPS","PM_ARB_A_MAN_DT"
"2021-10-01","E","000000000010014880","2017-09-18","DB21016","SKBMI823EJ","0000027802","002","O","10934902","140",",",",",",",","
"2021-10-01","E","000000000010014880","2017-09-18","DB21016","SKBMI823EJ","0000027803","002","O",",","WEBASTO ATHEMIA","No","No","26.01.2017","16.08.2019","EURO VI","No","26.01.2017"

bourner
New Contributor

Thank you so much Robin,

I have an issue now that when I put in the actual xml file (17MB to 34MB in size) it keeps disconnecting.

I believe its running out of memory, as I’m getting 12GB of memory allocated to this.

Is there a way to just flatten the KSSK elements and ignore the V_Fleet ones ?

thanks