Design pipelines to run sequentially

tliljekvist
New Contributor

So, I'm building a pipeline that performs a number of API requests and writes the responses to Blob storage.
When that pipeline and its child pipelines are done executing, I want to trigger a separate pipeline to process the data stored in Blob storage.
This is what the pipeline looks like before adding the trigger for the separate pipeline:
pipeline-early.png

My plan was to place a Tail snap followed by a REST Get snap after the last Pipeline Execute snaps, like this:
pipeline-late.png

That leads to my question:
Is this a good design or should I do this another way?
Since adding the Tail snap I've had more issues with nodes crashing, which I assume is caused by the Tail snap taking up resources.

Any input or feedback is highly appreciated as I feel like I might be missing something here.

Best Regards

Teddie

1 ACCEPTED SOLUTION

koryknick
Employee

@tliljekvist - The Tail snap is actually very lightweight: it simply waits for the end of its input and returns the last document it encountered. When you call the Triggered Task, does the pipeline it calls have any open output views that the REST Get snap would be waiting on for a response? If that pipeline returns a large amount of data through an open output view, that could be one issue. If it has no open output views, the REST Get waits for the called pipeline to complete before the calling pipeline can complete, so it functions more like a Pipeline Execute. You can resolve this by adding a Mapper to the called pipeline with no input view and an open output view that returns a simple response saying the pipeline has started. This allows the REST Get to receive an immediate response and close out.
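
As a loose analogy in plain Python (this is not SnapLogic code, and `start_job` and its arguments are hypothetical names), the Mapper trick amounts to acknowledging the call right away while the real work keeps running in the background:

```python
import threading
from typing import Callable, Dict, List, Tuple


def start_job(work: Callable[[], str],
              results: List[str]) -> Tuple[Dict[str, str], threading.Thread]:
    """Start `work` in the background and return an acknowledgement at once.

    The acknowledgement plays the role of the Mapper's "pipeline has started"
    response: the caller gets it without waiting for `work` to finish.
    """
    def runner() -> None:
        # The long-running part; its result is stored for later inspection.
        results.append(work())

    thread = threading.Thread(target=runner)
    thread.start()
    return {"status": "started"}, thread
```

The caller receives `{"status": "started"}` immediately and is free to finish, which also means it no longer knows when the background work ends. That is the trade-off of this approach.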

However, the recommended pattern for job dependencies is to use an enterprise scheduling tool such as Autosys, Control-M, or Tidal. Most schedulers can call the triggered tasks and wait for completion before calling the next job. There is a help page for configuring schedulers to work with long-running tasks.
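
To make the "call, wait, then call the next job" idea concrete, here is a minimal sketch of what such a scheduler does, written with the Python standard library. It is an illustration under assumptions, not a replacement for a real scheduler: the function names are hypothetical, the task URLs and bearer token are placeholders you would supply, and it assumes each triggered task only responds once its pipeline has finished.

```python
import urllib.error
import urllib.request
from typing import Callable, List, Sequence


def call_task(url: str, token: str, timeout: float = 3600.0) -> int:
    """Call one triggered task over HTTPS and block until it responds.

    Returns the HTTP status code. The long timeout is an assumption for
    long-running pipelines; adjust it to your own jobs.
    """
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urlopen raises on 4xx/5xx; report the code instead of crashing.
        return error.code


def run_in_order(urls: Sequence[str], token: str,
                 call: Callable[[str, str], int] = call_task) -> List[int]:
    """Run the tasks one after another, stopping at the first failure."""
    statuses: List[int] = []
    for url in urls:
        status = call(url, token)
        statuses.append(status)
        if not 200 <= status < 300:
            break  # do not start the next job if this one failed
    return statuses
```

Calling `run_in_order([fetch_task_url, process_task_url], token)` with your own two task URLs would start the processing pipeline only after the fetching pipeline has returned a success status. The `call` parameter exists so the ordering logic can be tried with a stand-in function before pointing it at real tasks.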

Hope this helps!

2 REPLIES

@koryknick - Thanks for the information.
I'll look into the scheduling tools you mentioned.