โ05-18-2022 09:16 AM
HI,
I have an xml file that has multiple child arrays within it.
Iโm struggling to get it into a flat format for export to csv, could someone spread a little snap love on this and help ๐
xml looks like this
<?xml version="1.0" encoding="utf-8"?> <FLEET>
<ZMI_PM_S_OUTPUT_FLEET_LIST>
<V_FLEET>
<FLEET_NUM>SKBMI823EJ</FLEET_NUM>
<EQUNR>000000000010014880</EQUNR>
<ERDAT>2017-09-18</ERDAT>
<ERNAM>DB21016</ERNAM>
<EQASP>E</EQASP>
<AEDAT>2021-10-01</AEDAT>
</V_FLEET>
<KSSK>
<ZMI_PM_S_FLEET_LIST_KSSK>
<KSSK_H>
<OBJEK>000000000010014880</OBJEK>
<MAFID>O</MAFID>
<KLART>002</KLART>
<CLINT>0000027802</CLINT>
<ZAEHL>20</ZAEHL>
<STATU>1</STATU>
</KSSK_H>
<AUSP>
<API_VAL_R>
<CHARACT>PM_ARB_A_BSTNR</CHARACT>
<VALUE>10934902</VALUE>
</API_VAL_R>
<API_VAL_R>
<CHARACT>PM_ARB_A_BSTPO</CHARACT>
<VALUE>140</VALUE>
</API_VAL_R>
</AUSP>
</ZMI_PM_S_FLEET_LIST_KSSK>
<ZMI_PM_S_FLEET_LIST_KSSK>
<KSSK_H>
<OBJEK>000000000010014880</OBJEK>
<MAFID>O</MAFID>
<KLART>002</KLART>
<CLINT>0000027803</CLINT>
<ZAEHL>10</ZAEHL>
<STATU>1</STATU>
</KSSK_H>
<AUSP>
<API_VAL_R>
<CHARACT>PM_ARB_A_AC_MAN</CHARACT>
<VALUE>WEBASTO ATHEMIA</VALUE>
</API_VAL_R>
<API_VAL_R>
<CHARACT>PM_ARB_A_ADVERT_FRAMES</CHARACT>
<VALUE>No</VALUE>
</API_VAL_R>
<API_VAL_R>
<CHARACT>PM_ARB_A_CCTV</CHARACT>
<VALUE>No</VALUE>
</API_VAL_R>
<API_VAL_R>
<CHARACT>PM_ARB_A_CURRENT_REG_DT</CHARACT>
<VALUE>26.01.2017</VALUE>
</API_VAL_R>
<API_VAL_R>
<CHARACT>PM_ARB_A_EMISSIONS</CHARACT>
<VALUE>16.08.2019</VALUE>
</API_VAL_R>
<API_VAL_R>
<CHARACT>PM_ARB_A_ENG_EURO_NORM</CHARACT>
<VALUE>EURO VI</VALUE>
</API_VAL_R>
<API_VAL_R>
<CHARACT>PM_ARB_A_GPS</CHARACT>
<VALUE>No</VALUE>
</API_VAL_R>
<API_VAL_R>
<CHARACT>PM_ARB_A_MAN_DT</CHARACT>
<VALUE>26.01.2017</VALUE>
</API_VAL_R>
</AUSP>
</ZMI_PM_S_FLEET_LIST_KSSK>
</KSSK>
</ZMI_PM_S_OUTPUT_FLEET_LIST>
</FLEET>
if it helps โOBJEKโ in <KSSK_H> is related to โEQUNRโ in <V_FLEET>
Many thanks
โ05-19-2022 05:31 AM
Could you try again to provide the XML sample (either as an attachment or using Markdown to enter a code block)
โ05-19-2022 06:38 AM
testxml.txt (2.1 KB)
Iโve attached the xml in txt document.
Is it also possible to rotate the โattributesโ (KSSK elements) so that there is only 1 or 2 rows row of data per fleet record ? (Some may not have any attributes)
Thanks
โ05-19-2022 07:03 AM
Donโt think Iโm being overly clear here, apologies.
What youโve done is brilliant, I think what i need to do is pivot so that the CHARACT rows become column headings in themselves, with the VALUE as the value.
So in the above image it would go
AEDAT, EQASP, EQUNR, ERDAT, ERNUM, FLEET NUM, PM_ARB_A_BSTNR, PM_ARB_A_BSTPOโฆ
Iโve attached an example
examplecsv.txt (456 Bytes)
โ05-20-2022 01:25 PM
Got it. Hereโs the solution - definitely some advanced Expression Language usage is required but I did learn quite a bit doing it:
12465_xml-to-flatten-csv_2022_05_20.slp (13.6 KB)
Since you want to keep it to two rows, we drop the second JSON Splitter from my first solution above - and Iโve renamed the first Splitter to Split on KSSK
:
So we have data that looks like this now:
Since you want CHARACT
values like PM_ARB_A_AC_MAN
to become keys, we have to do an Unpivot (i.e. making { "key": "someKey", "value": "someValue" }
become { "someKey": "someValue" }
)
I used the following expression to do that (note: yes, this should really be a new Snap๐
$AUSP.API_VAL_R.map((elem) => { [elem.CHARACT]: elem.VALUE } )
mapped to a temporary field $unpivoted
.
I also took the opportunity to flatten the structure of the other fields in the JSON in preparation for CSV:
Whatโs important here is that I moved them under a placeholder $root
object. That is because Iโll want to merge contents of the $unpivoted
array into $root
whilst also dropping the array itself. Itโs easier to do it this way rather than trying to modify the root $
document and also remove the array at the same time.
The last tricky step is the merge:
Again I use the Expression Language - this time the object.extend(target:data)
Object function:
$root.extend(...$unpivoted)
The triple-dot is the Spread Operator/Syntax:
Spread syntax (
...
) allows an iterable such as an array expression or string to be expanded in places where zero or more arguments (for function calls) or elements (for array literals) are expected, or an object expression to be expanded in places where zero or more key-value pairs (for object literals) are expected.
In other words, I can pass an array of objects of key/value pairs, and extend()
will combine the key/value pairs into a single object. And since I am extending the $root
objects, those k/v pairs will be combined with the others!
Binding the result of the expression to $
means Iโll have two flat objects ready to be used for CSV.
Finally, you need to manually specify the CSV Formatterโs Headers to use, since the documents have different fields (the default is to use the headers from the first document).
Enabling the โUse Headerโ checkbox and listing the fields from the input document:
should output your desired CSV content:
"AEDAT","EQASP","EQUNR","ERDAT","ERNAM","FLEET_NUM","CLINT","KLART","MAFID","PM_ARB_A_BSTNR","PM_ARB_A_BSTPO","PM_ARB_A_AC_MAN","PM_ARB_A_ADVERT_FRAMES","PM_ARB_A_CCTV","PM_ARB_A_CURRENT_REG_DT","PM_ARB_A_EMISSIONS","PM_ARB_A_ENG_EURO_NORM","PM_ARB_A_GPS","PM_ARB_A_MAN_DT"
"2021-10-01","E","000000000010014880","2017-09-18","DB21016","SKBMI823EJ","0000027802","002","O","10934902","140",",",",",",",","
"2021-10-01","E","000000000010014880","2017-09-18","DB21016","SKBMI823EJ","0000027803","002","O",",","WEBASTO ATHEMIA","No","No","26.01.2017","16.08.2019","EURO VI","No","26.01.2017"
โ05-24-2022 04:29 AM
Thank you so much Robin,
I have an issue now that when I put in the actual xml file (17MB to 34MB in size) it keeps disconnecting.
I believe its running out of memory, as Iโm getting 12GB of memory allocated to this.
Is there a way to just flatten the KSSK elements and ignore the V_Fleet ones ?
thanks