12-07-2021 02:46 PM
Hi,
I have a scenario where I need to pull millions of records, each with many fields, through an API call in the REST Get Snap. Unfortunately, the pipeline is failing because of the volume of records it is trying to fetch. All I need is a few key fields.
Is there a way to restrict the response to the same small set of fields for every record? That way the volume would be much lower and the REST Get call should no longer hit a read timeout during execution.
For example, the API response has 10 fields, but I only need to pull 3 of them.
Please help me with your suggestions.
Regards,
Amardeep Shanmugam.
12-07-2021 03:24 PM
That totally depends on whether the REST API you’re calling supports that. Who authored the REST API you’re calling? Do you have documentation for it?
12-07-2021 03:24 PM
Hi @amardeep2021,
You can’t restrict the fields in the REST Get Snap itself. It only works if the API you are pulling the data from accepts some kind of parameter in which you can specify the names of the fields you want, and that depends on the API.
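To illustrate what such a parameter could look like, here is a minimal Python sketch that builds a request URL with a field list. The parameter name `fields`, the host `api.example.com`, and the field names are all assumptions for illustration; your API may use a different name (or not support this at all), so check its documentation.

```python
from urllib.parse import urlencode

def build_url(base_url, wanted_fields, param_name="fields"):
    # "fields" is a hypothetical parameter name, not something every API supports.
    # safe="," keeps the commas readable instead of encoding them as %2C.
    query = urlencode({param_name: ",".join(wanted_fields)}, safe=",")
    return f"{base_url}?{query}"

# Hypothetical endpoint and field names, for illustration only.
url = build_url("https://api.example.com/records", ["id", "name", "status"])
print(url)
```

If the API does support something like this, the resulting URL (or the equivalent query parameters) is what you would configure in REST Get.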
Are you making many requests to the API, or is the API returning a large number of records in a single request?
If you are making many requests, you can optimize this a little: first group the data into chunks of, say, 1000 records and pass those chunks one by one to Pipeline Execute. The child pipeline it calls splits the data and makes the API call. After the call, use a Mapper to select only the fields you need and pass the result back to the parent pipeline.
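Outside of SnapLogic, the same idea (group into chunks, then keep only the needed fields) can be sketched in plain Python. This is only an illustration of the logic, not how the Snaps are implemented; the helper names `chunks` and `keep_fields` and the sample field names are made up for the example.

```python
from itertools import islice

def chunks(records, size=1000):
    # Yield successive lists of at most `size` records (the "group" step).
    it = iter(records)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch

def keep_fields(record, wanted):
    # Keep only the wanted keys, roughly what a Mapper would do after the call.
    return {key: record[key] for key in wanted if key in record}

# Hypothetical sample data: 2500 records with one field we do not need.
records = [{"id": i, "name": f"n{i}", "extra": "x"} for i in range(2500)]
batch_sizes = [len(batch) for batch in chunks(records, 1000)]
trimmed = keep_fields(records[0], ["id", "name"])
```

Each batch would correspond to one child pipeline execution, so no single execution has to hold the full data set.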
Regards,
Viktor
12-08-2021 09:15 AM
Thanks Taylor and Viktor.
Viktor,
I am making one REST GET API call to bring in millions of records and process them. Can you please elaborate a little more on the group by? I am currently using REST Get with a URL and trying to select the required fields with a Mapper after it. Are you suggesting that I use a Group By snap after REST Get and connect that to Pipeline Execute?
Please let me know.
Regards,
Amardeep Shanmugam.
12-08-2021 04:38 PM
My suggestion above only applies if you are making a large number of requests to the API. Since you are making a single request, it would not work.
In a situation like yours, you can only work with what the API provides.
You can check whether the API has a parameter for specifying the columns you want, or
see whether the API supports pagination.
It most likely does, because, as you said, you are bringing in millions of records, and that is a lot of data to pass in a single request.
If it supports pagination, check this blog, which explains how to paginate with the REST Get Snap.
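For reference, the general shape of offset-based pagination looks roughly like the Python sketch below. It assumes the API takes an offset and a page size and returns an empty page when there is nothing left; many APIs use page numbers, cursors, or "next" links instead, so treat the parameter names as hypothetical. `fake_fetch_page` is a stand-in for the real HTTP call so the example runs without a network.

```python
def fetch_all(fetch_page, page_size=1000):
    # Request one page at a time until the API returns an empty page.
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        if not page:
            break
        yield from page
        offset += len(page)

# Stand-in for the real API: 2500 hypothetical records served in slices.
DATA = [{"id": i} for i in range(2500)]
calls = []

def fake_fetch_page(offset, limit):
    calls.append((offset, limit))
    return DATA[offset:offset + limit]

rows = list(fetch_all(fake_fetch_page, page_size=1000))
```

Each request stays small, so a single slow response is much less likely to run into the read timeout than one call returning everything.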