Forum Discussion

msalter
New Contributor II
4 years ago
Solved

Lost contact with Snaplex node

I have a very small pipeline (3 snaps) that reads from a SQL table and writes the results to a new table in the same DB. I keep getting the error "Lost contact with Snaplex node" while the pipeline is running. The select statement pulls 250M+ records, and I'm using the Azure bulk insert to write. To avoid this error I keep having to reduce the batch size, from 20K to 10K and now to 8K. Any thoughts on what could be causing the error?

18 Replies

  • @darshthakkar - I believe @dmiller is correct that Shard Offsets is a custom snap. I’ve replicated the logic with core snaps and built-in expressions. See the attached example.
    Shard Example_2023_07_05.slp (5.6 KB)

    The Mapper is where the magic happens.

    sl.range(0, $TOTAL, Math.ceil($TOTAL / parseInt(_shard_count)))
      .map((val, idx, arr) =>
        idx == arr.length - 1
          ? { offset: val, limit: $TOTAL - val }
          : { offset: val, limit: arr[idx + 1] - val }
      )
    

    First, the sl.range() built-in function generates an array of the offsets to be used; then the Array.map() method turns that simple array of integers into an array of objects providing both the offset and limit for each shard.
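The Mapper expression can be sketched in plain JavaScript. This is only an illustration: `range` stands in for SnapLogic's `sl.range()` builtin, and the values for `TOTAL` and `shardCount` are hypothetical sample inputs.

```javascript
// Stand-in for SnapLogic's sl.range(start, stop, step) builtin.
function range(start, stop, step) {
  const out = [];
  for (let v = start; v < stop; v += step) out.push(v);
  return out;
}

const TOTAL = 10;     // hypothetical total record count (e.g. from SELECT COUNT(*))
const shardCount = 4; // hypothetical value of the _shard_count pipeline parameter

// Same logic as the Mapper expression: evenly spaced offsets, with the
// last shard taking whatever remainder is left.
const shards = range(0, TOTAL, Math.ceil(TOTAL / shardCount))
  .map((val, idx, arr) =>
    idx === arr.length - 1
      ? { offset: val, limit: TOTAL - val }        // last shard: remainder
      : { offset: val, limit: arr[idx + 1] - val } // others: span to next offset
  );

console.log(shards);
// [ { offset: 0, limit: 3 }, { offset: 3, limit: 3 },
//   { offset: 6, limit: 3 }, { offset: 9, limit: 1 } ]
```

Each object then becomes one document for the child pipeline, which can use its `offset`/`limit` pair to page through the source table.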

    After the Mapper, just use a JSON Splitter to get a new document for each offset/limit combination (the same output the Shard Offsets snap produces), which then feeds your child pipeline via the Pipeline Execute snap.

    Hope this helps!

    • ashis1upreti
      New Contributor II

      Hello koryknick, can you please help with my case? I am facing the same issue due to a large volume of data, and I did not quite understand how to set this up. Where do I add these Mappers? Basically, my child pipeline reads from an S3 URI passed by the parent pipeline, and when it splits that file's content it has 2M+ records to insert into the respective tables. I am attaching screenshots of my child and parent pipelines; where exactly do I modify them?

      (screenshots attached)

      • koryknick
        Employee

        ashis1upreti - The error you are getting has nothing to do with transaction volume. You are using an incorrect account type for the snap. When creating or selecting accounts, I recommend doing it in Designer via the Account dropdown in the snap where you want to use it. Keep in mind that accounts are specific to the snap pack of the snap being used, so you cannot use an "AWS S3" account (which exists in the AWS S3 snap pack) with the File Reader or S3 File Reader snaps (both of which are in the Binary snap pack).

        In the future, please open a new thread rather than reviving one that has already been resolved. This helps keep each topic focused. You can always link to a previous thread if you believe it is related.

  • ljupcho_machko1
    New Contributor III

    Hi @msalter,

    I assume the root cause is memory usage. In my case I was receiving the error message shown in the attached screenshot.

    It was resolved by using the Shard Offsets snap and transferring the data in threads via the Pipeline Execute snap.

      • darshthakkar
        Valued Contributor

        Did you use a Pipeline Execute with a batch/pool size defined?

  • darshthakkar
    Valued Contributor

    I did check the documentation, and there is nothing specified about the "Shard" snaps or sharding in general.

    • dmiller
      Former Employee

      I believe it is a custom Snap. Trying to track it down now. 🙂