06-06-2024 06:00 PM
I am reading a CSV/XLSX file (could be an sql table as well) with the follwing structure (file attached below as well):
Branch | Year | Type | ParentNumber | CurrentNumber | Structure | Level | Language1 | Language1_Value | Language2 | Language2_Value | CreationDate | LastUpdate |
1234 | 2024 | AAA | NULL | 12342024010000 | 01 | 1 | NL | voornaam | EN | first name | 05/06/2024 | 05/06/2024 |
1234 | 2024 | AAA | 12342024010000 | 12342024010100 | 01.01 | 2 | NL | achternaam | EN | last name | 05/06/2024 | 05/06/2024 |
1234 | 2024 | AAA | NULL | 12342024020000 | 02 | 1 | NL | geboortedatum | EN | birthday | 05/06/2024 | 05/06/2024 |
1234 | 2024 | AAA | NULL | 12342024030000 | 03 | 1 | NL | geslacht | EN | gender | 05/06/2024 | 05/06/2024 |
1234 | 2024 | AAA | 12342024030000 | 12342024030100 | 03.01 | 2 | NL | opslaan | EN | save | 05/06/2024 | 05/06/2024 |
1234 | 2024 | AAA | 12342024030100 | 12342024030101 | 03.01.01 | 3 | NL | versturen | EN | send | 05/06/2024 | 05/06/2024 |
1234 | 2024 | AAA | 12342024030100 | 12342024030102 | 03.01.02 | 3 | NL | sluiten | EN | close | 05/06/2024 | 05/06/2024 |
I want to transform this table (could be thousands of rows) to the following JSON structure(file attached below as well):
Is this even possible in the first place?
Can someone help me out?
Thank you!
Solved! Go to Solution.
06-07-2024 12:13 PM
@adam_g - since there is no defined max level, the solution uses some advanced methodology and I will do my best to explain in some detail. First, you should familiarize yourself with Expression Libraries, Arrow Functions, jsonPath, the Spread and Ternary Operators, Object Functions, and Array Functions such as map() and filter(). These are advanced topics that may take you some time to feel comfortable with, and I will at least explain each of the functions that I used in the example pipeline that I have attached.
Please download the zip and decompress it. Upload the parentChild.expr into your SLDB into the same project where you import the pipeline.
The first snap after the Excel Parser is Group By Fields which simply pivots the data from individual records to being grouped by the specified complex primary key defined in the settings.
Next is the Mapper, which first moves the $groupBy fields to the root level. Then the "MaxLevel" is determined by using the Math.max() function against the array of "Level" values. Note the use of the Spread Operator to explode the array into individual values for the method to work from. jsonPath is used here to find all the "Level" values found within the array that was just created by the Group By Fields snap. Array.map() is also used to convert the "Level" from string to integer using the built-in parseInt() function.
The "Level1" value is where some real magic happens. Since we don't know how many levels deep we need to build, we need to use a bit of recursion to make it work. This is where the expression library and arrow functions help us out. Note that the call is pretty simple, just calling the mapLevel() function in the library, passing it the "Root" array (built from the Group By Fields snap), the starting level number, and the "ParentNumber" that we want to start from - which in this case are those coming in as "NULL". Be sure to include the parentChild.expr file in your SLDB in the same project where you import the pipeline - this expr file is where all the logic lies to build your JSON.
mapLevel : (rootArray, level, pNum) =>
This simply defines the function "mapLevel" with 3 parameters, which we saw when we called it in the Mapper.
[].concat(rootArray)
The first thing I want to do is make a copy of the array that is passed in. Since we need to filter out only the parts of the array we're interested in based on ParentNumber, I don't want to accidentally affect my original array since I need to look through it multiple times for the different values we need to build.
.filter(elem=> typeof elem == 'object' && elem.ParentNumber == pNum)
This is the filter to find only the elements related to the ParentNumber we're interested in.
.map((elem,idx,arr)=>
Now we are going to rebuild each array element (now filtered down to the ParentNumber we're working on) into the structure you defined.
.extend(
This is the Object.extend() method to add elements into the Object we just created for each array element.
rootArray.findIndex(x=> x.ParentNumber == elem.CurrentNumber) < 0
We use the Array.findIndex() method to check to see if there is another level we need to dive into by checking to see if there are any elements in the original array where the ParentNumber is the same as the CurrentNumber value of the filtered array we're currently working in.
: { ["Level" + (level + 1)] : __root__.mapLevel (rootArray, level + 1, elem.CurrentNumber) }
Finally, here is the real magic of this library... recursion! Thanks to the Array.findIndex() and ternary operator, we know there is another level to dive into. So we'll build that next level by calling the "mapLevel" function again, but this time passing the CurrentNumber value of the filtered array element we're working in. The "__root__" designation just let's me access the function name again, see Object Literals for more information.
Hope this helps!
06-07-2024 04:29 AM
@adam_g - couple questions to clarify the requirements as it will affect how this needs to be solved.
Happy Friday! Hopefully we can get you a solution quickly!
06-07-2024 05:02 AM
Hi @koryknick,
Thank you for your commenting!
I hope this answered your questions. English is not my native language, it could be that I misunderstood the 2nd question.
Happy Friday to you too!
06-07-2024 12:13 PM
@adam_g - since there is no defined max level, the solution uses some advanced methodology and I will do my best to explain in some detail. First, you should familiarize yourself with Expression Libraries, Arrow Functions, jsonPath, the Spread and Ternary Operators, Object Functions, and Array Functions such as map() and filter(). These are advanced topics that may take you some time to feel comfortable with, and I will at least explain each of the functions that I used in the example pipeline that I have attached.
Please download the zip and decompress it. Upload the parentChild.expr into your SLDB into the same project where you import the pipeline.
The first snap after the Excel Parser is Group By Fields which simply pivots the data from individual records to being grouped by the specified complex primary key defined in the settings.
Next is the Mapper, which first moves the $groupBy fields to the root level. Then the "MaxLevel" is determined by using the Math.max() function against the array of "Level" values. Note the use of the Spread Operator to explode the array into individual values for the method to work from. jsonPath is used here to find all the "Level" values found within the array that was just created by the Group By Fields snap. Array.map() is also used to convert the "Level" from string to integer using the built-in parseInt() function.
The "Level1" value is where some real magic happens. Since we don't know how many levels deep we need to build, we need to use a bit of recursion to make it work. This is where the expression library and arrow functions help us out. Note that the call is pretty simple, just calling the mapLevel() function in the library, passing it the "Root" array (built from the Group By Fields snap), the starting level number, and the "ParentNumber" that we want to start from - which in this case are those coming in as "NULL". Be sure to include the parentChild.expr file in your SLDB in the same project where you import the pipeline - this expr file is where all the logic lies to build your JSON.
mapLevel : (rootArray, level, pNum) =>
This simply defines the function "mapLevel" with 3 parameters, which we saw when we called it in the Mapper.
[].concat(rootArray)
The first thing I want to do is make a copy of the array that is passed in. Since we need to filter out only the parts of the array we're interested in based on ParentNumber, I don't want to accidentally affect my original array since I need to look through it multiple times for the different values we need to build.
.filter(elem=> typeof elem == 'object' && elem.ParentNumber == pNum)
This is the filter to find only the elements related to the ParentNumber we're interested in.
.map((elem,idx,arr)=>
Now we are going to rebuild each array element (now filtered down to the ParentNumber we're working on) into the structure you defined.
.extend(
This is the Object.extend() method to add elements into the Object we just created for each array element.
rootArray.findIndex(x=> x.ParentNumber == elem.CurrentNumber) < 0
We use the Array.findIndex() method to check to see if there is another level we need to dive into by checking to see if there are any elements in the original array where the ParentNumber is the same as the CurrentNumber value of the filtered array we're currently working in.
: { ["Level" + (level + 1)] : __root__.mapLevel (rootArray, level + 1, elem.CurrentNumber) }
Finally, here is the real magic of this library... recursion! Thanks to the Array.findIndex() and ternary operator, we know there is another level to dive into. So we'll build that next level by calling the "mapLevel" function again, but this time passing the CurrentNumber value of the filtered array element we're working in. The "__root__" designation just let's me access the function name again, see Object Literals for more information.
Hope this helps!
06-11-2024 07:17 AM
Hi @koryknick,
This is absolutely amazing! Thank you so much! I never would have figured this out on my own.
Your explanation is super helpful. I'll dive into all the links you shared.
Cheers!