cancel
Showing results for 
Search instead for 
Did you mean: 

Mapper dynamic content

Tuomas
New Contributor II

Hello!

I have REST API source from which I ingest data and load it to BigQuery. I am executing this dynamically with parameters as there are multiple endpoints to query. I am using mapper to define the schema before loading, in this case with the pass-through enabled. However, I have run into a problem, that sometimes the API doesn’t return certain object in the json. Then, when I auto-create the table during load, it may not define this missing column in the schema based on the sample, but then breaks when this column suddenly appears. Due to this being a dynamic pipeline, I cannot hard-code any single columns to the mapper, as the column names are different for each job. Also, the number of columns might differ.

So is there a way to dynamically define the column names I want to keep from the source? For example I a good solution could be one where I can define a single json, array or some other sort of string that contains the namespace of columns I’d like to keep, and just give that as a parameter to the mapper.

Just to showcase with an example what I would want out with the dynamic expression:

Source 1:
[
{
“a”:“123”,
“b”:“qwerty”,
“c”:“aaa”,
“d”:asd",
“e”:“111”
},
{
“a”:“456”,
“b”:“zxcvbn”,
“c”:“bbb”,
“d”:fgh",
“e”:“222”
}
]

Target schema 1:
a, b, c, e

Source 2:

{
“h”:“123”,
“i”:“qwerty”,
“j”:“aaa”,
“k”:asd",
“l”:“111”
},

Target schema 2:
i, j, k

Hopefully you get what I’m after 🙂 Thanks

-Tuomas

1 ACCEPTED SOLUTION

AleksandarAngel
Contributor II

Hi @Tuomas,

One way that you can try is to define the columns for each schemas and to append that object to the response documents.

Schemas columns object example:

[
    {
        "schema1":['a','b'],
        "schema2":['c','d'],
        "schema3":['e','f']
    }
]

Also, you can make this as part of a config file.

Then you can append this object to every document using the Join Snap with the following configuration:
image

So, finally you can dynamically filter out the columns from the root object using the following expression in a Mapper Snap:

$.filter((v,k) => $.get(_schema).indexOf(k) != -1)

Sample pipeline:
–sl-comm-dynamic-columns-parameter_2023_05_12.slp (6.8 KB)

Let me know if this helps you @Tuomas.

BR,
Aleksandar.

View solution in original post

5 REPLIES 5

AleksandarAngel
Contributor II

Hi @Tuomas,

One way that you can try is to define the columns for each schemas and to append that object to the response documents.

Schemas columns object example:

[
    {
        "schema1":['a','b'],
        "schema2":['c','d'],
        "schema3":['e','f']
    }
]

Also, you can make this as part of a config file.

Then you can append this object to every document using the Join Snap with the following configuration:
image

So, finally you can dynamically filter out the columns from the root object using the following expression in a Mapper Snap:

$.filter((v,k) => $.get(_schema).indexOf(k) != -1)

Sample pipeline:
–sl-comm-dynamic-columns-parameter_2023_05_12.slp (6.8 KB)

Let me know if this helps you @Tuomas.

BR,
Aleksandar.

Seems to work like a charm 🙂 Thank you!

Tuomas
New Contributor II

A follow-up question:

Any chance I could rename some of the columns at the same time using similar logic? 🙂

BR,
Tuomas

Hi @Tuomas,

Please take a look at the attached pipeline below:

–sl-comm-dynamic-columns-parameter_2023_05_17.slp (8.3 KB)

Let me know if this helps.

BR,
Aleksandar.