I am trying to insert 70K records in a loop until a condition is false. The looping logic works fine, but each cycle takes about 1 hour to insert the 70K records. Please find attached the pipeline I am executing. Is there a way to perform the insertion faster? Any tips would be helpful.
The only insert step I see is the Redshift Bulk Load snap. Is there any other step where you are doing an insert?
You can also try increasing the Batch size and Fetch size in the account setup.
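To illustrate why batch size matters for insert-style snaps: sending one row per round trip is far slower than sending many rows per statement. A minimal sketch below, using an in-memory SQLite database as a stand-in for Redshift (the `target` table and its columns are made up for illustration, not from this pipeline):

```python
# Sketch: per-row inserts vs. batched inserts.
# SQLite stands in for Redshift; table/column names are illustrative only.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (id INTEGER, val TEXT)")

records = [(i, f"row-{i}") for i in range(70_000)]

# Batch size 1 (the slow pattern): one statement per record.
# for rec in records:
#     conn.execute("INSERT INTO target VALUES (?, ?)", rec)

# Larger batches: hand the driver all rows and let it batch them.
conn.executemany("INSERT INTO target VALUES (?, ?)", records)
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]
print(count)  # 70000
```

The same principle applies to JDBC batch/fetch sizes in the account settings: fewer, larger round trips usually mean less per-record overhead.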
Yes, it's the Redshift Bulk Load snap. The pipeline checks a condition in the Redshift Select snaps, and if it is true, it executes itself again until the condition is false.
So the set of snaps in this pipeline is executed repeatedly in a loop until the condition is false.
Other than updating the batch and pool size in the account settings, is there a way to address this slow execution?
Hi Rajesh, I have the batch size and pool size in Pipeline Execute set to 1. The total record count for each loop cycle is 70K. Any tips on what the batch size and pool size should be to make this pipeline complete sooner? Please let me know.
You can filter your records using a Filter snap before the Redshift snap, and then use the Bulk Load snap to load all the filtered records into the target table. Batch size is irrelevant for the Bulk Load snap; it loads all the records at once.
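For context on why batch size does not apply here: a Redshift bulk load typically stages the data to S3 and then issues a single `COPY` statement, rather than inserting row by row. A hedged sketch of what that statement looks like (the table name, S3 bucket, and IAM role below are placeholders, not values from this pipeline):

```python
# Sketch of the COPY statement a Redshift bulk load effectively runs.
# All identifiers below are hypothetical placeholders.
def build_copy_sql(table: str, s3_path: str, iam_role: str) -> str:
    """Assemble a Redshift COPY statement for CSV data staged in S3."""
    return (
        f"COPY {table}\n"
        f"FROM '{s3_path}'\n"
        f"IAM_ROLE '{iam_role}'\n"
        "FORMAT AS CSV;"
    )

sql = build_copy_sql(
    "public.target_table",
    "s3://my-staging-bucket/batch-0001/",
    "arn:aws:iam::123456789012:role/redshift-copy-role",
)
print(sql)
```

Because the whole staged file is loaded in one `COPY`, the per-record batch size setting has no effect on this snap.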
I already have a filter condition before the Bulk Load snap to restrict the records, but it has still been executing for 43 hours. The total record count in the target table is close to 2.5 million records.
Is there a way to reduce the execution time?
Can you share a screenshot of your pipeline design? Or a runtime ID would be great.
Hi Aleung, please find attached. Kindly let me know if this is what you are looking for.
I stopped the execution as it had been running for 49 hours.