03-17-2022 03:59 PM
I have a very small pipeline (3 snaps) that reads from a SQL table and writes the results to a new table in the same DB. I keep getting an error: “Lost contact with Snaplex node while the pipeline was running.” The select statement pulls 250M+ records, and I’m using the Azure bulk insert to write. To avoid this error I keep having to reduce the batch size, from 20K to 10K to now 8K. Any thoughts on what could be causing the error?
07-05-2023 10:40 AM
@darshthakkar - I believe @dmiller is correct that Shard Offsets is a custom snap. I’ve replicated its logic with core snaps and built-in expressions. See the attached example.
Shard Example_2023_07_05.slp (5.6 KB)
The Mapper is where the magic happens.
sl.range(0, $TOTAL, Math.ceil($TOTAL / parseInt(_shard_count)))
  .map((val, idx, arr) =>
    idx == arr.length - 1
      ? { offset: val, limit: $TOTAL - val }
      : { offset: val, limit: arr[idx + 1] - val }
  )
First, the sl.range() built-in function generates an array of the offsets to use. Then the Array.map() method converts that simple array of integers into an array of objects, each providing both the offset and the limit for one shard. The last shard gets whatever remains ($TOTAL minus its offset), so the limits always add up to $TOTAL.
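If it helps to check the arithmetic outside SnapLogic, here is a rough Python sketch of the same offset/limit calculation. It assumes sl.range() excludes the stop value the way Python's range() does; the function name shard_offsets is made up for illustration and is not part of any snap.

```python
import math


def shard_offsets(total: int, shard_count: int) -> list[dict]:
    """Mirror the Mapper expression: one {offset, limit} object per shard."""
    if total <= 0 or shard_count <= 0:
        return []

    # Same step as Math.ceil($TOTAL / parseInt(_shard_count)).
    step = math.ceil(total / shard_count)

    # Assumption: sl.range(0, total, step) behaves like range(0, total, step).
    starts = list(range(0, total, step))

    shards = []
    for idx, start in enumerate(starts):
        if idx == len(starts) - 1:
            # Last shard takes whatever is left.
            limit = total - start
        else:
            # Every other shard runs up to the next offset.
            limit = starts[idx + 1] - start
        shards.append({"offset": start, "limit": limit})
    return shards


if __name__ == "__main__":
    for shard in shard_offsets(10, 3):
        print(shard)
```

Because the step is rounded up, the result can contain fewer shards than requested: 10 rows with a shard count of 6 gives a step of 2 and therefore 5 shards.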
After the Mapper, just use a JSON Splitter to get a new document for each limit and offset combination (the same output as the Shard Offsets snap). Each document then feeds your child pipeline through the Pipeline Execute snap.
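The thread doesn't show what the child pipeline does with each document, so the following is only a guess at one way to use it: turn the offset and limit into a paged T-SQL query. The table name, the order column, and the helper paged_select are all hypothetical; OFFSET ... FETCH is standard T-SQL and requires an ORDER BY.

```python
def paged_select(table: str, order_by: str, offset: int, limit: int) -> str:
    """Build a T-SQL query that reads one shard's slice of rows.

    Hypothetical helper: table and order_by are placeholders and are
    assumed to be trusted identifiers, not user input.
    """
    return (
        f"SELECT * FROM {table} "
        f"ORDER BY {order_by} "
        f"OFFSET {int(offset)} ROWS FETCH NEXT {int(limit)} ROWS ONLY"
    )


if __name__ == "__main__":
    # One shard document, as produced after the JSON Splitter.
    shard = {"offset": 8, "limit": 2}
    print(paged_select("dbo.source_table", "id", shard["offset"], shard["limit"]))
```

The order column should be stable and unique (a primary key, for example); otherwise different shards may overlap or skip rows.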
Hope this helps!
06-22-2023 07:08 AM
Did you use a Pipeline Execute snap with the batch/pool size defined?
07-04-2023 07:58 AM
Yes, the Pipeline Execute snap has been used in combination with the Shard Offsets snap.
07-05-2023 08:47 AM
Thank you @ljupcho_machkovski. I don’t even see the “Shard Offsets” snap in my Org. Can you share a screenshot of that snap so that I can have a word with my CSM?
Thank you.
07-05-2023 08:50 AM
I did check the documentation, and it says nothing about “Shard” snaps or about sharding in general.
07-05-2023 08:52 AM
I believe it is a custom Snap. Trying to track it down now. 🙂