How to Chunk data for REST PUT

arvindnsn
Contributor

Hello,

I am trying to source data from an Oracle table and send the documents to a REST PUT Snap. The Oracle fetch typically returns 1.7 million rows, and by the time all 1.7 million have been sent to REST PUT, the run fails with an authorization error because the API has a maximum 15-minute threshold.

I am trying to find a solution where I can break the data into smaller chunks (maybe 25K per batch, which would give around 70 batches) and send each batch to REST PUT one after the other in a loop.
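
For clarity, this is roughly what I am after, sketched in Python outside SnapLogic. The table name, endpoint URL, and get_token() are just placeholders, not my real setup:

```python
# Rough sketch of the chunked approach: fetch 25K rows at a time and PUT each
# batch, refreshing the auth token per batch to stay inside the 15-minute window.
import oracledb
import requests

BATCH_SIZE = 25_000          # ~70 batches for 1.7 million rows
SOURCE_SQL = """
    SELECT *
      FROM my_source_table
     ORDER BY id
    OFFSET :off ROWS FETCH NEXT :lim ROWS ONLY
"""

def get_token():
    """Placeholder: re-authenticate so each batch starts with a fresh token."""
    return "new-bearer-token"

conn = oracledb.connect(user="me", password="secret", dsn="db-host/orclpdb1")
cur = conn.cursor()

offset = 0
while True:
    cur.execute(SOURCE_SQL, {"off": offset, "lim": BATCH_SIZE})
    cols = [d[0] for d in cur.description]
    rows = [dict(zip(cols, r)) for r in cur.fetchall()]
    if not rows:
        break

    token = get_token()  # fresh token before each batch
    resp = requests.put(
        "https://api.example.com/documents",   # placeholder endpoint
        json=rows,
        headers={"Authorization": f"Bearer {token}"},
        timeout=300,
    )
    resp.raise_for_status()
    offset += BATCH_SIZE
```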

Please advise whether this is feasible, and whether there is a better solution.

Thanks in advance
Aravind N

3 REPLIES

skatpally
Former Employee

You can do this in multiple ways.

In the parent pipeline you can compute shard offsets, and in the child pipeline you can use a fetch offset to query that slice of data and send it to REST PUT.

Also, using the Pipeline Execute Snap with reuse and a pool size is a good way to achieve parallelism and improve performance.

The Reuse executions and Pool size settings on the Pipeline Execute Snap need to be set appropriately.
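
As a rough illustration outside SnapLogic, here is a Python sketch of the same parent/child split, with a thread pool standing in for the Pipeline Execute pool size. fetch_slice and rest_put are placeholders for the child pipeline's query and REST PUT:

```python
# "Parent" computes the shard offsets; each worker plays the role of a child
# pipeline, querying its slice and sending it to the REST endpoint.
from concurrent.futures import ThreadPoolExecutor

TOTAL_ROWS = 1_700_000
BATCH_SIZE = 25_000
POOL_SIZE = 4                 # analogous to the Pipeline Execute pool size

def fetch_slice(offset, limit):
    """Placeholder: run the Oracle query for one slice, e.g.
    OFFSET :off ROWS FETCH NEXT :lim ROWS ONLY, and return the rows."""
    return []

def rest_put(rows):
    """Placeholder: send one batch of rows to the REST PUT endpoint."""
    pass

def child(offset, limit):
    rows = fetch_slice(offset, limit)
    rest_put(rows)

shards = [(off, BATCH_SIZE) for off in range(0, TOTAL_ROWS, BATCH_SIZE)]

with ThreadPoolExecutor(max_workers=POOL_SIZE) as pool:
    for off, lim in shards:
        pool.submit(child, off, lim)
```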

Useful ideas, but where do you set up a “Shard Offset” (in 33 years I’ve never heard the term), and where does the fetch offset go? And is the SQL query in the parent with the document stream passed to the child, or the other way around?

skatpally
Former Employee

We have a field snap where the total number of records is divided into N partitions (e.g. 1-10, 11-20, 21-30). You can also write a script to get the partitions. In your child pipeline you can then apply the fetch offset accordingly.
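
For example, a small script to generate the partitions could look like this (the row counts are just examples):

```python
# Generate (first_row, last_row) partitions in the style 1-10, 11-20, 21-30, ...
def partitions(total_rows, batch_size):
    """Yield (first_row, last_row) pairs covering 1..total_rows."""
    for start in range(1, total_rows + 1, batch_size):
        yield start, min(start + batch_size - 1, total_rows)

# Example: 1.7M rows in 25K batches -> 68 partitions
for first, last in partitions(1_700_000, 25_000):
    print(first, last)
```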