Help with DIFF SNAP

PayalS
New Contributor II

Hi,

I am looking for some help with the DIFF Snap.
Below is the scenario I am working on:

I have two source DBs, A and B. The DIFF Snap is used to identify the records eligible for insert, update and delete, and the corresponding operations are then performed on Source B, which is the target DB.
For example:
Let's say there are 10 records in A
and 20 records in B;
then DIFF will identify 10 records for deletion from B.

Now, the problem I am facing is this:
if no records come from Source A because of a data issue or a connection failure, DIFF sends all records from B down the deletion link, which eventually deletes all the data from B.
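To make the failure mode concrete, here is a minimal plain-Python model of the comparison described above. It is not the DIFF Snap's implementation; the `diff` function, the `id` key and the sample rows are all hypothetical, and it assumes the 10 records in A are a subset of the 20 in B.

```python
def diff(source_a, source_b, key="id"):
    """Classify records with A as the reference and B as the target (illustrative only)."""
    a_by_key = {rec[key]: rec for rec in source_a}
    b_by_key = {rec[key]: rec for rec in source_b}
    # In A but not in B: should be inserted into B.
    inserts = [rec for k, rec in a_by_key.items() if k not in b_by_key]
    # In both, but with different content: should be updated in B.
    updates = [rec for k, rec in a_by_key.items() if k in b_by_key and rec != b_by_key[k]]
    # In B but not in A: should be deleted from B.
    deletes = [rec for k, rec in b_by_key.items() if k not in a_by_key]
    return inserts, updates, deletes


source_b = [{"id": i, "value": f"row-{i}"} for i in range(20)]
source_a = source_b[:10]  # assumption: A holds 10 of B's 20 records

_, _, normal_deletes = diff(source_a, source_b)
_, _, failed_deletes = diff([], source_b)  # A returned nothing, e.g. connection failure

print(len(normal_deletes))  # 10
print(len(failed_deletes))  # 20
```

An empty A is indistinguishable from "every record was removed from A", which is why every row of B ends up on the delete path.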

I don't want this to happen. I want my pipeline to stop/fail if any such error happens in the source, so that the target can never be emptied.
I am using an Error Pipeline, which tracks the error fine but does not stop the pipeline.

Is there a way to track the error and stop the pipeline in such cases of connection failure or data issue?

Quick help would be really appreciated.

Thanks in Advance,
Payal Srivastava

1 ACCEPTED SOLUTION

PayalS
New Contributor II

@koryknick @Spiro_Taleski
Thanks for your valuable comments 🙂 🙏
I just want to share the good news that I have finally been able to crack this.
Please find below the workarounds I used to make it work:

  1. Updated the Error pipeline to insert records in the DB.
  2. Filtered for data-connection-specific errors.
  3. Used an Exit snap with a threshold value of ‘0’.
  4. Created a separate account with a batch size of 1 for this error pipeline.

With this, my error pipeline stops the parent pipeline only for those particular errors, and lets the pipeline continue for all other data failures.
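The four steps above can be sketched in plain Python as follows. This only models the logic and is not SnapLogic code: `PipelineAborted`, `CONNECTION_MARKERS`, the `reason` field and the sample error texts are all hypothetical, and the reading of batch size 1 (each error row is written and passed on immediately instead of waiting for a batch to fill) is an assumption.

```python
class PipelineAborted(Exception):
    """Models the Exit step stopping the parent pipeline."""


# Hypothetical markers for connection problems; real error text will differ.
CONNECTION_MARKERS = ("connection", "timeout", "unable to connect")


def is_connection_error(error_doc):
    """Step 2: keep only errors that look connection-related."""
    reason = error_doc.get("reason", "").lower()
    return any(marker in reason for marker in CONNECTION_MARKERS)


def handle_errors(error_docs, error_table, threshold=0):
    """Log every error, then abort once connection errors exceed the threshold."""
    connection_errors = 0
    for doc in error_docs:
        error_table.append(doc)  # steps 1 and 4: one insert per error, no batching
        if is_connection_error(doc):
            connection_errors += 1
            if connection_errors > threshold:  # step 3: threshold 0 trips on the first one
                raise PipelineAborted(doc["reason"])


error_table = []
# A plain data failure is logged and the pipeline carries on.
handle_errors([{"reason": "Invalid date in column X"}], error_table)
try:
    # A connection failure is logged first, then stops the pipeline.
    handle_errors([{"reason": "Connection refused"}], error_table)
    stopped = False
except PipelineAborted:
    stopped = True
```

The ordering matters: the error is written to the table before the abort, so the log is complete even when the pipeline is stopped.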

Thanks,
Payal Srivastava


14 REPLIES

koryknick
Employee

You can examine the error details in your Error Pipeline and call the Exit snap immediately if it is a connection error.

Another option may be to uncheck “Ignore empty result” and add a Router that looks for an empty result set. If empty, send to the Exit snap; otherwise process as usual.

I understand it’s updating 70 pipelines, but it would be a very consistent and quick fix.
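The Router idea amounts to a guard in front of the diff. Here is a minimal sketch in plain Python, assuming that an empty result shows up either as no documents at all or as an empty placeholder document; `require_rows` and `PipelineAborted` are hypothetical names, not SnapLogic APIs.

```python
class PipelineAborted(Exception):
    """Models the Exit snap stopping the pipeline."""


def require_rows(documents, source_name):
    """Pass rows through, or abort if the source produced nothing usable."""
    rows = [doc for doc in documents if doc]  # drop empty placeholder documents
    if not rows:
        raise PipelineAborted(f"{source_name} returned no records; refusing to run the diff")
    return rows


rows_ok = require_rows([{"id": 1}, {"id": 2}], "Source A")  # processed as usual

try:
    require_rows([{}], "Source A")  # empty result set
    aborted = False
except PipelineAborted:
    aborted = True
```

Note that this sketch reads the whole source before deciding, which is simpler than what a streaming pipeline does.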

PayalS
New Contributor II

Thanks @koryknick for your reply. Much appreciated 🙏

I tried this by explicitly filtering the connection errors after inserting them with the Oracle snap, and putting an Exit snap after that. But this only stops that flow, not the parent pipeline, so Source B keeps sending data to DIFF, which deletes all the data from B.

PayalS
New Contributor II

@koryknick
This is already unchecked in my design, but it is still not working: it again only stops the link from Source A. Source B keeps sending data to DIFF and ultimately to the delete link, which empties B. ☹️

I tried multiple approaches, but have not found a quick fix. Ultimately I want to send a STOP signal to my pipeline in case of any source connection issue, but only after logging the error in the error table.

The problem is going to be the logging. The formatter and file writer will wait until the end of your input stream, meaning that all documents will be processed by all input datastreams before it closes.

Depending on which formatter you’re using, you may be able to “short out” the logging. For example, if you are using a JSON Formatter, you can use the “Format each document” option, which creates a file for each document and allows you to complete logging for the one record and immediately call the Exit snap.
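The difference between the two logging modes can be illustrated with Python generators. This is only a model of the streaming behaviour described above, not how the formatter is implemented; all function names here are hypothetical.

```python
import json


def buffered_log(errors):
    """One output for the whole stream: nothing is emitted until the input ends."""
    yield json.dumps(list(errors))


def per_document_log(errors):
    """One output per error, available as soon as that error arrives."""
    for err in errors:
        yield json.dumps(err)


def error_stream(seen):
    """Emit three errors, recording how far the upstream has progressed."""
    for i in range(3):
        seen.append(i)
        yield {"error": i}


seen_buffered = []
next(buffered_log(error_stream(seen_buffered)))      # wait for the first log output

seen_per_doc = []
next(per_document_log(error_stream(seen_per_doc)))   # wait for the first log output

print(seen_buffered)  # [0, 1, 2]
print(seen_per_doc)   # [0]
```

In the buffered case the whole stream has already been consumed by the time the first log output exists, so an Exit placed after it fires too late. In the per-document case only the first error has been read.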

PayalS
New Contributor II

@koryknick
I am using an error table to log the error. I tried using Oracle Insert to log the error, then a Filter on the data-connection-specific error description, and then an Exit snap, so that as soon as any data connection error is logged into the table it reaches the Exit and causes the pipeline to stop.
But this also doesn’t seem to be working, as the Exit is not causing the pipeline to fail. It waits for the entire stream to be processed, and by that time the records from Source B have reached DIFF and made it into the deletion flow 🙁 So my task is not accomplished 😢