cancel
Showing results for 
Search instead for 
Did you mean: 

Handling missing object properties with Mapper

Tuomas
New Contributor II

Hello,

I've been struggling with this for few hours now and I simply can't get it to work as I want to. Therefore I'm asking for help. So I have the following input incoming to Mapper:

{
"id": 1,
"labels": {
    "fi_FI": "label_1"
}
},
{
"id":2,
"labels": {
    "fi_FI": "label_2",
    "en_US": "label_3"
}
}

So I have a missing en_US label on the first object. How do I add a null or blank en_US label? So the output would look like:

{
"id": 1,
"labels": {
    "fi_FI": "label_1",
    "en_US": null
}
},
{
"id":2,
"labels": {
    "fi_FI": "label_2",
    "en_US": "label_3"
}
}

There can be possibly be X more labels-properties. Fully dynamic solution would be the best, but I could work with determining all the labels-properties as well.

Cheers!

 

 

3 REPLIES 3

koryknick
Employee
Employee

@Tuomas - You could simply enable the "Null-safe access" checkbox in the Mapper.  With that enabled, whenever you reference a field that doesn't exist, it simply assigns it as null.

Tuomas
New Contributor II

Hi @koryknick ! Thanks for response!

Yes, I have the Null-safe access enabled. In this case it doesn't work though. I have "labels" as my output field, not "fi_FI" and "en_US". Thus, when "labels" in input is completely missing, then I will have null output for "labels" but also I'm missing both the properties. Sure, one way to do it would be to map each property to its own output field which I have done in the past when facing the similar problem. But I figured maybe I should find an alternative solution as I don't really need the "labels"-object to be split. Also, if I hard code the "labels"-object, I might lose something in the future if more properties are suddenly added to it.

Essentially, the missing properties were problematic when loading data to Big Query. An object with one of the properties missing happened to be the first object loaded which the connector apparently uses to determine Big Query table's schema. Big Query adds objects to table as "labels.fi_FI", "labels.en_US", therefore a property missing from first object created the table without that column and breaking the load for subsequent objects.

Hopefully this clarified the issue 🙂

koryknick
Employee
Employee

@Tuomas - thanks for the clarification.  As SnapLogic is a streaming platform, it uses the first record to build the list of columns used for all subsequent records for many endpoints.  Attached is a pipeline I created for you to ensure all records have all fields contained in the "labels" field of the input documents.  Note that this does not do a deep copy of the structure, so if you have nested objects within "labels", you could wind up with the same issue depending on how your target system functions.  Please download the zip file, decompress it, and import the pipeline to view the solution in full.

koryknick_0-1702314735728.png

There are a few sections here that I want to provide some description. 

For "Map fieldNames", the "$lables.keys()" expression simply returns the list of field names from the $labels sub-object of your input document.  The rest of that bottom path is simply gathering up a complete list of field names that are discovered across all of your input documents.  

In the "Map all fieldNames", let me break down that expression:

jsonPath($fieldNames, "$[*].fieldName")
.toObject(cur=> cur, cur=> null)
.merge($labels)

jsonPath let's you reference object sub-elements, returning an array of those elements within the input document.  I recommend following the link to external documentation for even more information.

The Array.toObject() method converts the entire array into an Object.  Since our array is a list of field names, this call is simply creating an object with all field names with a null value for each.

Finally, the Object.merge() method does a deep merge of the original input document with the new "nulled Object" we just created from the list of field names.

Hope this helps!