In our organization, SnapLogic is one of many tools used to integrate data. We use an enterprise scheduler to manage execution, alerting, and dependencies across all platforms and technologies. Specifically, we use Cisco's Tidal Enterprise Scheduler to execute SnapLogic pipelines, SSIS packages, Informatica workflows, FTP file movements, command-line executables, etc.
To expose a pipeline to an external scheduler, we create a triggered task and give the exposed API URL to the web service adapter within Tidal. Tidal executes the pipeline and gets a response of "200 - OK" because the pipeline task was successfully triggered. This doesn't tell us that the pipeline finished successfully, just that it kicked off successfully.
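To make the limitation concrete, here is a minimal sketch of what that trigger call looks like from a script's point of view. The URL, token handling, and helper names are assumptions for illustration, not SnapLogic specifics:

```python
import urllib.request

def trigger_pipeline(task_url, bearer_token):
    """Fire a SnapLogic triggered task (hypothetical URL/auth scheme).
    A 200 response only means the pipeline was kicked off; it says
    nothing about whether the pipeline will eventually succeed."""
    req = urllib.request.Request(
        task_url, headers={"Authorization": f"Bearer {bearer_token}"}
    )
    with urllib.request.urlopen(req) as resp:
        return trigger_accepted(resp.status)

def trigger_accepted(status_code):
    # 200 = the trigger itself worked; anything else = the trigger failed.
    # Neither outcome tells us anything about the pipeline's final result.
    return status_code == 200
```

In other words, the only failure this call can surface is a failure to *start* the pipeline, which is why a separate status check is needed.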
To catch failures, we use System Center Operations Manager to call the summary pipeline status API. It returns any failures, which are then sent to our IT Operations team, who triage and notify the responsible parties.
We've been running this way for a while, and it has worked well enough. Now we're exposing SnapLogic to more projects and more development groups, and as a result the demands around successful execution and downstream dependencies have increased. We need our scheduler to know when jobs succeed, fail, or run long, and we need each team to be notified of its own pipeline failures.
From here on I'm talking theory. I'm very interested in what others have come up with as a solution to enterprise scheduling.
Since the only response the scheduler gets back from the REST API call is 200 - OK, we can't rely on it to determine whether the job was successful. SnapLogic has published a set of APIs that return the status of an individual pipeline. If we can make our scheduler dependent on the result of a subsequent status call, we should be able to alert accordingly.
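The core of that idea is translating a pipeline state into a scheduler action. A small sketch, with the caveat that the state names below ("Prepared", "Started", "Completed", "Failed", "Stopped") are my assumptions about what the status API returns and should be checked against the actual API response:

```python
# Assumed pipeline state names; verify against the real status API.
RUNNING_STATES = {"Prepared", "Started"}
SUCCESS_STATES = {"Completed"}
FAILURE_STATES = {"Failed", "Stopped"}

def classify_state(state):
    """Translate a pipeline runtime state into what the scheduler
    should do next: keep polling, mark success, or fail the job."""
    if state in RUNNING_STATES:
        return "keep_polling"
    if state in SUCCESS_STATES:
        return "success"
    if state in FAILURE_STATES:
        return "fail"
    # Treat unknown states as failures so they surface loudly
    # rather than silently stalling a dependency chain.
    return "fail"
```

Failing closed on unknown states is a deliberate choice here: a false alarm is cheaper than a downstream job running against bad data.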
To accomplish this, I’m attempting to implement the following (haven’t connected all the dots yet):
- Add a mapper to each parent pipeline that has an open output and returns the URL used to monitor that pipeline (+pipeline.ruuid).
- Create a Tidal job (a) to call the initial pipeline task that does the actual integration.
- Create a Tidal job (b), dependent on (a)'s success, that calls the monitoring URL returned from (a) repeatedly at a short interval and logs the return code to a Tidal variable.
- If (b) returns "Running", keep trying. If (b) returns "Failed", fail the job. If (b) returns success, mark the job as successful.
- Create a Tidal job (c) that runs the next actual integration and is dependent on both the success of (b) and a value of "Success" in the Tidal variable.
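The polling logic in job (b) could be a small script the scheduler runs, exiting 0 on success and 1 on failure so the dependency chain can key off the exit code. This is a sketch under assumptions: the status endpoint is hypothetical, and I'm assuming a JSON body with a `state` field and terminal states named "Completed", "Failed", and "Stopped":

```python
import json
import time
import urllib.request

def fetch_state(status_url, fetcher=None):
    """Fetch the current pipeline state. `fetcher` is injectable for
    testing; by default this GETs the status URL and reads a 'state'
    field from the JSON body (an assumed response shape)."""
    if fetcher is not None:
        return fetcher(status_url)
    with urllib.request.urlopen(status_url) as resp:
        return json.load(resp).get("state", "Unknown")

def poll_until_done(status_url, interval=30, max_polls=120, fetcher=None):
    """Poll the monitoring URL until the pipeline reaches a terminal
    state. Returns 0 for success, 1 for failure (exit-code style)."""
    for _ in range(max_polls):
        state = fetch_state(status_url, fetcher)
        if state == "Completed":
            return 0
        if state in ("Failed", "Stopped"):
            return 1
        time.sleep(interval)  # still running; wait and check again
    return 1  # ran long past the poll budget: fail so the scheduler alerts
```

The `max_polls` budget doubles as the "run long" detector: if the pipeline hasn't finished within the window, the job fails and the scheduler alerts, which covers the long-running case as well as outright failures.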
This is quite a bit of tedium just to handle the success or failure of a job, and while I haven't yet successfully implemented this solution, I feel like it's within reach.
What solutions have others come up with for managing dependencies and alerting across your enterprise?