05-18-2022 09:16 AM
HI,
I have an xml file that has multiple child arrays within it.
I’m struggling to get it into a flat format for export to csv, could someone spread a little snap love on this and help 🙂
xml looks like this
<?xml version="1.0" encoding="utf-8"?> <FLEET>
<ZMI_PM_S_OUTPUT_FLEET_LIST>
<V_FLEET>
<FLEET_NUM>SKBMI823EJ</FLEET_NUM>
<EQUNR>000000000010014880</EQUNR>
<ERDAT>2017-09-18</ERDAT>
<ERNAM>DB21016</ERNAM>
<EQASP>E</EQASP>
<AEDAT>2021-10-01</AEDAT>
</V_FLEET>
<KSSK>
<ZMI_PM_S_FLEET_LIST_KSSK>
<KSSK_H>
<OBJEK>000000000010014880</OBJEK>
<MAFID>O</MAFID>
<KLART>002</KLART>
<CLINT>0000027802</CLINT>
<ZAEHL>20</ZAEHL>
<STATU>1</STATU>
</KSSK_H>
<AUSP>
<API_VAL_R>
<CHARACT>PM_ARB_A_BSTNR</CHARACT>
<VALUE>10934902</VALUE>
</API_VAL_R>
<API_VAL_R>
<CHARACT>PM_ARB_A_BSTPO</CHARACT>
<VALUE>140</VALUE>
</API_VAL_R>
</AUSP>
</ZMI_PM_S_FLEET_LIST_KSSK>
<ZMI_PM_S_FLEET_LIST_KSSK>
<KSSK_H>
<OBJEK>000000000010014880</OBJEK>
<MAFID>O</MAFID>
<KLART>002</KLART>
<CLINT>0000027803</CLINT>
<ZAEHL>10</ZAEHL>
<STATU>1</STATU>
</KSSK_H>
<AUSP>
<API_VAL_R>
<CHARACT>PM_ARB_A_AC_MAN</CHARACT>
<VALUE>WEBASTO ATHEMIA</VALUE>
</API_VAL_R>
<API_VAL_R>
<CHARACT>PM_ARB_A_ADVERT_FRAMES</CHARACT>
<VALUE>No</VALUE>
</API_VAL_R>
<API_VAL_R>
<CHARACT>PM_ARB_A_CCTV</CHARACT>
<VALUE>No</VALUE>
</API_VAL_R>
<API_VAL_R>
<CHARACT>PM_ARB_A_CURRENT_REG_DT</CHARACT>
<VALUE>26.01.2017</VALUE>
</API_VAL_R>
<API_VAL_R>
<CHARACT>PM_ARB_A_EMISSIONS</CHARACT>
<VALUE>16.08.2019</VALUE>
</API_VAL_R>
<API_VAL_R>
<CHARACT>PM_ARB_A_ENG_EURO_NORM</CHARACT>
<VALUE>EURO VI</VALUE>
</API_VAL_R>
<API_VAL_R>
<CHARACT>PM_ARB_A_GPS</CHARACT>
<VALUE>No</VALUE>
</API_VAL_R>
<API_VAL_R>
<CHARACT>PM_ARB_A_MAN_DT</CHARACT>
<VALUE>26.01.2017</VALUE>
</API_VAL_R>
</AUSP>
</ZMI_PM_S_FLEET_LIST_KSSK>
</KSSK>
</ZMI_PM_S_OUTPUT_FLEET_LIST>
</FLEET>
if it helps “OBJEK” in <KSSK_H> is related to “EQUNR” in <V_FLEET>
Many thanks
05-19-2022 05:31 AM
Could you try again to provide the XML sample (either as an attachment or using Markdown to enter a code block)
05-19-2022 06:38 AM
testxml.txt (2.1 KB)
I’ve attached the xml in txt document.
Is it also possible to rotate the ‘attributes’ (KSSK elements) so that there is only 1 or 2 rows row of data per fleet record ? (Some may not have any attributes)
Thanks
05-19-2022 07:03 AM
Don’t think I’m being overly clear here, apologies.
What you’ve done is brilliant, I think what i need to do is pivot so that the CHARACT rows become column headings in themselves, with the VALUE as the value.
So in the above image it would go
AEDAT, EQASP, EQUNR, ERDAT, ERNUM, FLEET NUM, PM_ARB_A_BSTNR, PM_ARB_A_BSTPO…
I’ve attached an example
examplecsv.txt (456 Bytes)
05-20-2022 01:25 PM
Got it. Here’s the solution - definitely some advanced Expression Language usage is required but I did learn quite a bit doing it:
12465_xml-to-flatten-csv_2022_05_20.slp (13.6 KB)
Since you want to keep it to two rows, we drop the second JSON Splitter from my first solution above - and I’ve renamed the first Splitter to Split on KSSK
:
So we have data that looks like this now:
Since you want CHARACT
values like PM_ARB_A_AC_MAN
to become keys, we have to do an Unpivot (i.e. making { "key": "someKey", "value": "someValue" }
become { "someKey": "someValue" }
)
I used the following expression to do that (note: yes, this should really be a new Snap😞
$AUSP.API_VAL_R.map((elem) => { [elem.CHARACT]: elem.VALUE } )
mapped to a temporary field $unpivoted
.
I also took the opportunity to flatten the structure of the other fields in the JSON in preparation for CSV:
What’s important here is that I moved them under a placeholder $root
object. That is because I’ll want to merge contents of the $unpivoted
array into $root
whilst also dropping the array itself. It’s easier to do it this way rather than trying to modify the root $
document and also remove the array at the same time.
The last tricky step is the merge:
Again I use the Expression Language - this time the object.extend(target:data)
Object function:
$root.extend(...$unpivoted)
The triple-dot is the Spread Operator/Syntax:
Spread syntax (
...
) allows an iterable such as an array expression or string to be expanded in places where zero or more arguments (for function calls) or elements (for array literals) are expected, or an object expression to be expanded in places where zero or more key-value pairs (for object literals) are expected.
In other words, I can pass an array of objects of key/value pairs, and extend()
will combine the key/value pairs into a single object. And since I am extending the $root
objects, those k/v pairs will be combined with the others!
Binding the result of the expression to $
means I’ll have two flat objects ready to be used for CSV.
Finally, you need to manually specify the CSV Formatter’s Headers to use, since the documents have different fields (the default is to use the headers from the first document).
Enabling the “Use Header” checkbox and listing the fields from the input document:
should output your desired CSV content:
"AEDAT","EQASP","EQUNR","ERDAT","ERNAM","FLEET_NUM","CLINT","KLART","MAFID","PM_ARB_A_BSTNR","PM_ARB_A_BSTPO","PM_ARB_A_AC_MAN","PM_ARB_A_ADVERT_FRAMES","PM_ARB_A_CCTV","PM_ARB_A_CURRENT_REG_DT","PM_ARB_A_EMISSIONS","PM_ARB_A_ENG_EURO_NORM","PM_ARB_A_GPS","PM_ARB_A_MAN_DT"
"2021-10-01","E","000000000010014880","2017-09-18","DB21016","SKBMI823EJ","0000027802","002","O","10934902","140",",",",",",",","
"2021-10-01","E","000000000010014880","2017-09-18","DB21016","SKBMI823EJ","0000027803","002","O",",","WEBASTO ATHEMIA","No","No","26.01.2017","16.08.2019","EURO VI","No","26.01.2017"
05-24-2022 04:29 AM
Thank you so much Robin,
I have an issue now that when I put in the actual xml file (17MB to 34MB in size) it keeps disconnecting.
I believe its running out of memory, as I’m getting 12GB of memory allocated to this.
Is there a way to just flatten the KSSK elements and ignore the V_Fleet ones ?
thanks