Multi child xml to csv help required

HI,

I have an xml file that has multiple child arrays within it.

I’m struggling to get it into a flat format for export to csv, could someone spread a little snap love on this and help :slight_smile:

xml looks like this

<?xml version="1.0" encoding="utf-8"?>
	<FLEET>
		<ZMI_PM_S_OUTPUT_FLEET_LIST>
			<V_FLEET>
				<FLEET_NUM>SKBMI823EJ</FLEET_NUM>
				<EQUNR>000000000010014880</EQUNR>
				<ERDAT>2017-09-18</ERDAT>
				<ERNAM>DB21016</ERNAM>
				<EQASP>E</EQASP>
				<AEDAT>2021-10-01</AEDAT>
			</V_FLEET>
			<KSSK>
				<ZMI_PM_S_FLEET_LIST_KSSK>
					<KSSK_H>
						<OBJEK>000000000010014880</OBJEK>
						<MAFID>O</MAFID>
						<KLART>002</KLART>
						<CLINT>0000027802</CLINT>
						<ZAEHL>20</ZAEHL>
						<STATU>1</STATU>
					</KSSK_H>
					<AUSP>
						<API_VAL_R>
							<CHARACT>PM_ARB_A_BSTNR</CHARACT>
							<VALUE>10934902</VALUE>
						</API_VAL_R>
						<API_VAL_R>
							<CHARACT>PM_ARB_A_BSTPO</CHARACT>
							<VALUE>140</VALUE>
						</API_VAL_R>
					</AUSP>
				</ZMI_PM_S_FLEET_LIST_KSSK>
				<ZMI_PM_S_FLEET_LIST_KSSK>
					<KSSK_H>
						<OBJEK>000000000010014880</OBJEK>
						<MAFID>O</MAFID>
						<KLART>002</KLART>
						<CLINT>0000027803</CLINT>
						<ZAEHL>10</ZAEHL>
						<STATU>1</STATU>
					</KSSK_H>
					<AUSP>
						<API_VAL_R>
							<CHARACT>PM_ARB_A_AC_MAN</CHARACT>
							<VALUE>WEBASTO ATHEMIA</VALUE>
						</API_VAL_R>
						<API_VAL_R>
							<CHARACT>PM_ARB_A_ADVERT_FRAMES</CHARACT>
							<VALUE>No</VALUE>
						</API_VAL_R>
						<API_VAL_R>
							<CHARACT>PM_ARB_A_CCTV</CHARACT>
							<VALUE>No</VALUE>
						</API_VAL_R>
						<API_VAL_R>
							<CHARACT>PM_ARB_A_CURRENT_REG_DT</CHARACT>
							<VALUE>26.01.2017</VALUE>
						</API_VAL_R>
						<API_VAL_R>
							<CHARACT>PM_ARB_A_EMISSIONS</CHARACT>
							<VALUE>16.08.2019</VALUE>
						</API_VAL_R>
						<API_VAL_R>
							<CHARACT>PM_ARB_A_ENG_EURO_NORM</CHARACT>
							<VALUE>EURO VI</VALUE>
						</API_VAL_R>
						<API_VAL_R>
							<CHARACT>PM_ARB_A_GPS</CHARACT>
							<VALUE>No</VALUE>
						</API_VAL_R>
						<API_VAL_R>
							<CHARACT>PM_ARB_A_MAN_DT</CHARACT>
							<VALUE>26.01.2017</VALUE>
						</API_VAL_R>
					</AUSP>
				</ZMI_PM_S_FLEET_LIST_KSSK>
			</KSSK>
		</ZMI_PM_S_OUTPUT_FLEET_LIST>
	</FLEET>

if it helps “OBJEK” in <KSSK_H> is related to “EQUNR” in <V_FLEET>

Many thanks

Hi @bourner,

Did you try using a JSON splitter already?

Hi Yes,

I’ve tried a JSON splitter.

I’ve tried to flatten both parts and then join them together. The V_Fleet xml element/objects are easy enough,

the KSSK part I’m unable to flatten

Thanks

Try this pipeline (I don’t know what kind of CSV you want but it may be a start).

12465_xml-to-flatten-csv_2022_05_19.slp (12.3 KB)

I’m using two JSON Splitters here:

and

The Mapper has basic flattening rules:

You’ll end up with an output that looks like this:

1 Like

That’s great,

I have a slight problem in that the actual xml also had some additional levels

<asx:abap version=“1.0” xmlns:asx=“ABAP Development”>
asx:values

Example xml here

</asx:values>

</asx:abap>

As soon as I use this xml it fails. Could you possibly point me in the right direction, and possibly explain what it’s doing ?

Apologies, I’m new to SL.

Thanks

Could you try again to provide the XML sample (either as an attachment or using Markdown to enter a code block)

testxml.txt (2.1 KB)

I’ve attached the xml in txt document.

Is it also possible to rotate the ‘attributes’ (KSSK elements) so that there is only 1 or 2 rows row of data per fleet record ? (Some may not have any attributes)

Thanks

Don’t think I’m being overly clear here, apologies.

What you’ve done is brilliant, I think what i need to do is pivot so that the CHARACT rows become column headings in themselves, with the VALUE as the value.

So in the above image it would go

AEDAT, EQASP, EQUNR, ERDAT, ERNUM, FLEET NUM, PM_ARB_A_BSTNR, PM_ARB_A_BSTPO…

I’ve attached an example
examplecsv.txt (456 Bytes)

Got it. Here’s the solution - definitely some advanced Expression Language usage is required but I did learn quite a bit doing it:


12465_xml-to-flatten-csv_2022_05_20.slp (13.6 KB)

Since you want to keep it to two rows, we drop the second JSON Splitter from my first solution above - and I’ve renamed the first Splitter to Split on KSSK:

So we have data that looks like this now:

Since you want CHARACT values like PM_ARB_A_AC_MAN to become keys, we have to do an Unpivot (i.e. making { "key": "someKey", "value": "someValue" } become { "someKey": "someValue" })

I used the following expression to do that (note: yes, this should really be a new Snap):

$AUSP.API_VAL_R.map((elem) => { [elem.CHARACT]: elem.VALUE } )

mapped to a temporary field $unpivoted.

I also took the opportunity to flatten the structure of the other fields in the JSON in preparation for CSV:

What’s important here is that I moved them under a placeholder $root object. That is because I’ll want to merge contents of the $unpivoted array into $root whilst also dropping the array itself. It’s easier to do it this way rather than trying to modify the root $ document and also remove the array at the same time.

The last tricky step is the merge:

Again I use the Expression Language - this time the object.extend(target:data) Object function:

$root.extend(...$unpivoted)

The triple-dot is the Spread Operator/Syntax:

Spread syntax ( ... ) allows an iterable such as an array expression or string to be expanded in places where zero or more arguments (for function calls) or elements (for array literals) are expected, or an object expression to be expanded in places where zero or more key-value pairs (for object literals) are expected.

In other words, I can pass an array of objects of key/value pairs, and extend() will combine the key/value pairs into a single object. And since I am extending the $root objects, those k/v pairs will be combined with the others!

Binding the result of the expression to $ means I’ll have two flat objects ready to be used for CSV.

Finally, you need to manually specify the CSV Formatter’s Headers to use, since the documents have different fields (the default is to use the headers from the first document).

Enabling the “Use Header” checkbox and listing the fields from the input document:

should output your desired CSV content:

"AEDAT","EQASP","EQUNR","ERDAT","ERNAM","FLEET_NUM","CLINT","KLART","MAFID","PM_ARB_A_BSTNR","PM_ARB_A_BSTPO","PM_ARB_A_AC_MAN","PM_ARB_A_ADVERT_FRAMES","PM_ARB_A_CCTV","PM_ARB_A_CURRENT_REG_DT","PM_ARB_A_EMISSIONS","PM_ARB_A_ENG_EURO_NORM","PM_ARB_A_GPS","PM_ARB_A_MAN_DT"
"2021-10-01","E","000000000010014880","2017-09-18","DB21016","SKBMI823EJ","0000027802","002","O","10934902","140",",",",",",",","
"2021-10-01","E","000000000010014880","2017-09-18","DB21016","SKBMI823EJ","0000027803","002","O",",","WEBASTO ATHEMIA","No","No","26.01.2017","16.08.2019","EURO VI","No","26.01.2017"
1 Like

Thank you so much Robin,

I have an issue now that when I put in the actual xml file (17MB to 34MB in size) it keeps disconnecting.

I believe its running out of memory, as I’m getting 12GB of memory allocated to this.

Is there a way to just flatten the KSSK elements and ignore the V_Fleet ones ?

thanks

Is there a way to just flatten the KSSK elements and ignore the V_Fleet ones ?

You can delete the V_FLEET entry in the JSON Splitter’s “Include Paths” setting and then update the rest of the pipeline where those fields were being referenced.