Forum Discussion

adam_gataev

New Contributor II

2 years ago

Solved

Creating complex nested JSON from CSV file

I am reading a CSV/XLSX file (could be an sql table as well) with the follwing structure (file attached below as well):

Branch

Year

Type

ParentNumber

CurrentNumber

Structure

Level

Language1

Language1_Value

Language2

Language2_Value

CreationDate

LastUpdate

1234

2024

AAA

NULL

12342024010000

voornaam

first name

05/06/2024

1234

2024

AAA

12342024010000

12342024010100

01.01

achternaam

last name

05/06/2024

1234

2024

AAA

NULL

12342024020000

geboortedatum

birthday

05/06/2024

1234

2024

AAA

NULL

12342024030000

geslacht

gender

05/06/2024

1234

2024

AAA

12342024030000

12342024030100

03.01

opslaan

save

05/06/2024

1234

2024

AAA

12342024030100

12342024030101

03.01.01

versturen

send

05/06/2024

1234

2024

AAA

12342024030100

12342024030102

03.01.02

sluiten

05/06/2024

I want to transform this table (could be thousands of rows) to the following JSON structure(file attached below as well):

{

"Branch": 1234,

"Year": 2024,

"Type": "AAA",

"MaxLevel": 3,

"Level1": [

{

"ParentNumber": "NULL",

"CurrentNumber": 12342024010000,

"Structure": "01",

"Level": 1,

"Translations": [

{

"Language1": "NL",

"Language1_Value": "voornaam"

{

"Language2": "EN",

"Language2_Value": "first name"

}

"LastUpdate": "2024-05-06",

"Level2": [

{

"ParentNumber": 12342024010000,

"CurrentNumber": 12342024010100,

"Structure": "01.01",

"Level": 2,

"Translations": [

{

"Language1": "NL",

"Language1_Value": "achternaam"

{

"Language2": "EN",

"Language2_Value": "last name"

}

"LastUpdate": "2024-05-06"

}

]

{

"ParentNumber": "NULL",

"CurrentNumber": 12342024020000,

"Structure": "02",

"Level": 1,

"Translations": [

{

"Language1": "NL",

"Language1_Value": "geboortedatum"

{

"Language2": "EN",

"Language2_Value": "birthday"

}

"LastUpdate": "2024-05-06"

{

"ParentNumber": "NULL",

"CurrentNumber": 12342024030000,

"Structure": "03",

"Level": 1,

"Translations": [

{

"Language1": "NL",

"Language1_Value": "geslacht"

{

"Language2": "EN",

"Language2_Value": "gender"

}

"LastUpdate": "2024-05-06",

"Level2": [

{

"ParentNumber": 12342024030000,

"CurrentNumber": 12342024030100,

"Structure": "03.01",

"Level": 2,

"Translations": [

{

"Language1": "NL",

"Language1_Value": "opslaan"

{

"Language2": "EN",

"Language2_Value": "save"

}

"LastUpdate": "2024-05-06",

"Level3": [

{

"ParentNumber": 12342024030100,

"CurrentNumber": 12342024030101,

"Structure": "03.01.01",

"Level": 3,

"Translations": [

{

"Language1": "NL",

"Language1_Value": "versturen"

{

"Language2": "EN",

"Language2_Value": "send"

}

"LastUpdate": "2024-05-06"

{

"ParentNumber": 12342024030100,

"CurrentNumber": 12342024030102,

"Structure": "03.01.02",

"Level": 3,

"Translations": [

{

"Language1": "NL",

"Language1_Value": "sluiten"

{

"Language2": "EN",

"Language2_Value": "close"

}

"LastUpdate": "2024-05-06"

}

]

}

]

}

"CreationDate": "2024-05-06",

"LastUpdate": "2024-05-06"

}

Is this even possible in the first place?

Can someone help me out?

Thank you!

input.xlsx10 KB

output.json3 KB

Aleksandar_A
3 years ago
Hello @negi_sagar,

You can try by using a Mapper Snap with Pass-through enabled:

For Choices try using the below expression:

$Choices.extend({[Math.max(...$Choices.keys().map(x => parseInt(x))).toString()]:$Choices.get(Math.max(...$Choices.keys().map(x => parseInt(x))).toString()).extend({"TextEntry":"true"})})

Sample pipeline:
SL-comm-add-field-to-max-in-json_2023_03_20.slp (4.4 KB)

Let me know if this helps you.

BR,
Aleksandar.
negi_sagar
3 years ago
Thank you Aleksandar for answering this quickly and it worked.

Can you tell me some resources from where I learn to create these expressions?

6 Replies

koryknick
Employee
2 years ago
adam_gataev - since there is no defined max level, the solution uses some advanced methodology and I will do my best to explain in some detail. First, you should familiarize yourself with Expression Libraries, Arrow Functions, jsonPath, the Spread and Ternary Operators, Object Functions, and Array Functions such as map() and filter(). These are advanced topics that may take you some time to feel comfortable with, and I will at least explain each of the functions that I used in the example pipeline that I have attached.

Please download the zip and decompress it. Upload the parentChild.expr into your SLDB into the same project where you import the pipeline.

The first snap after the Excel Parser is Group By Fields which simply pivots the data from individual records to being grouped by the specified complex primary key defined in the settings.

Next is the Mapper, which first moves the $groupBy fields to the root level. Then the "MaxLevel" is determined by using the Math.max() function against the array of "Level" values. Note the use of the Spread Operator to explode the array into individual values for the method to work from. jsonPath is used here to find all the "Level" values found within the array that was just created by the Group By Fields snap. Array.map() is also used to convert the "Level" from string to integer using the built-in parseInt() function.

The "Level1" value is where some real magic happens. Since we don't know how many levels deep we need to build, we need to use a bit of recursion to make it work. This is where the expression library and arrow functions help us out. Note that the call is pretty simple, just calling the mapLevel() function in the library, passing it the "Root" array (built from the Group By Fields snap), the starting level number, and the "ParentNumber" that we want to start from - which in this case are those coming in as "NULL". Be sure to include the parentChild.expr file in your SLDB in the same project where you import the pipeline - this expr file is where all the logic lies to build your JSON.

mapLevel : (rootArray, level, pNum) =>

This simply defines the function "mapLevel" with 3 parameters, which we saw when we called it in the Mapper.

[].concat(rootArray)

The first thing I want to do is make a copy of the array that is passed in. Since we need to filter out only the parts of the array we're interested in based on ParentNumber, I don't want to accidentally affect my original array since I need to look through it multiple times for the different values we need to build.

.filter(elem=> typeof elem == 'object' && elem.ParentNumber == pNum)

This is the filter to find only the elements related to the ParentNumber we're interested in.

.map((elem,idx,arr)=>

Now we are going to rebuild each array element (now filtered down to the ParentNumber we're working on) into the structure you defined.

.extend(

This is the Object.extend() method to add elements into the Object we just created for each array element.

rootArray.findIndex(x=> x.ParentNumber == elem.CurrentNumber) < 0

We use the Array.findIndex() method to check to see if there is another level we need to dive into by checking to see if there are any elements in the original array where the ParentNumber is the same as the CurrentNumber value of the filtered array we're currently working in.

: { ["Level" + (level + 1)] : __root__.mapLevel (rootArray, level + 1, elem.CurrentNumber) }

Finally, here is the real magic of this library... recursion! Thanks to the Array.findIndex() and ternary operator, we know there is another level to dive into. So we'll build that next level by calling the "mapLevel" function again, but this time passing the CurrentNumber value of the filtered array element we're working in. The "__root__" designation just let's me access the function name again, see Object Literals for more information.

Hope this helps!
Nested JSON from CSV.zip2 KB
- adam_gataev
  New Contributor II
  2 years ago
  Hi koryknick,
  This is absolutely amazing! Thank you so much! I never would have figured this out on my own.
  Your explanation is super helpful. I'll dive into all the links you shared.
  Cheers!
- adam_gataev
  New Contributor II
  2 years ago
  Hi koryknick,
  It got a bit more complicated. The SQL database contains different Branch, Year and Type values. The user has the option to send the Branch, Year and Type value as parameters to filter the database. Branch and Year are optional, but the Type is mandatory and has to be given by the user in the parameter in all cases. If the user decides to only give the the Type for example, I have to return all Branch and Year combination JSON objects that have that Type value. For example, the user sends parameters Type="AAA", then this could be the result:
  { { "Branch": 123, "Year": 2023, "Type": "AAA", ... "Level1": [...], "CreationDate": "2024-05-06", "LastUpdate": "2024-06-17" }, { "Branch": 9, "Year": 2023, "Type": "AAA", ... "Level1": [...], "CreationDate": "2024-05-06", "LastUpdate": "2024-06-17" }, { "Branch": 1, "Year": 2024, "Type": "AAA", ... "Level1": [...], "CreationDate": "2024-05-06", "LastUpdate": "2024-06-17" }, { "Branch": 7, "Year": 2017, "Type": "AAA", ... "Level1": [...], "CreationDate": "2024-05-06", "LastUpdate": "2024-06-17" }, ... }
  Same logic in case I get the other parameters as well. I basically have to filter them first and then for every Branch, Year combination, do the library expression mapping like before.
  Any idea what I should add or change to achieve this? I was thinking of using a child pipeline, but how would I then give all the rows to the child pipeline...
  
  Thank you so much in advance!
koryknick
Employee
2 years ago
adam_gataev - it occurred to me over the weekend that this could also be solved using a JSON Generator snap and the Apache Velocity Engine syntax. I don't have time to produce this example for you but I have tested some very basic concepts and am confident that it would work using a Velocimacro to handle the recursive nature of the data.

You could also try using a Script snap to construct your hierarchical JSON from the array as I had created it from the Group By Fields snap. Here is an article that might help with that. Keep in mind that you need to follow the template given by the Script snap, but the logic to create the complex structure should be usable.
koryknick
Employee
2 years ago
adam_gataev - couple questions to clarify the requirements as it will affect how this needs to be solved.

How many levels are possible? If it always maxes out at 3, the solution is a bit simpler - if it is completely arbitrary and could extend into the dozens, then it is a more complicated solution

Will Branch, Year, and Type always be the key components of your record sets?

Happy Friday! Hopefully we can get you a solution quickly!
- adam_gataev
  New Contributor II
  2 years ago
  Hi koryknick,
  Thank you for your commenting!
  Unfortunately, MaxLevel is not pre-set nor is there a hard limit. It is simply the maximum value of the Level column from the CSV/SQL table.
  Yes, all fields in the JSON and the SQL table will always be present. The JSON structure, including fields at the root level or deeper levels, will remain consistent. The only variation is the depth of the levels or whether a row contains a deeper level.
  I hope this answered your questions. English is not my native language, it could be that I misunderstood the 2nd question.
  Happy Friday to you too!

Forum Discussion

Creating complex nested JSON from CSV file

6 Replies

Recent Discussions

Javascript to promote top level lists

Google Sheets Subscribe questions

Basic string transformations not working

Can we generate XML file in pretty print format using native snapLogic snaps?

Multipart Reader failure - 'content-type' was not found