I have a “batch” process today, where a parent pipeline pulls a list of items (thousands) from some application’s REST API and executes (Pipeline Execute) a child pipeline for each one to work the individual items. Each child process consists of downloading audio recordings from an internal application and then uploading them to an external vendor, so not super speedy… It works okay, but there is overhead spinning up the child processes, and it doesn’t effectively leverage/balance all my Groundplex nodes, making it take longer to work the list than I think it should.
We recently purchased Ultra, and I was thinking about building a new decoupled process, where the parent process just throws list items onto a Kafka topic, and each node has Ultra pipelines running and listening to that topic, pulling off messages and doing work at its own pace. Unfortunately, our organization has mandated Kerberos authentication for our Kafka infrastructure, and I don’t have a decent way of doing that with SnapLogic… Does anyone have experience/examples of building Kafka producers/consumers in SnapLogic with Kerberos authentication?
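For context, outside of SnapLogic a Kerberized Kafka client is normally configured with SASL/GSSAPI client properties like the sketch below; this is what I’d need the Snap/account settings to support one way or another. The keytab path, principal, and realm here are placeholders for our environment, not real values:

```properties
# Standard Kafka client properties for Kerberos (SASL/GSSAPI).
# Keytab path, principal, and realm below are placeholders.
security.protocol=SASL_SSL
sasl.mechanism=GSSAPI
sasl.kerberos.service.name=kafka
sasl.jaas.config=com.sun.security.auth.module.Krb5LoginModule required \
    useKeyTab=true \
    storeKey=true \
    keyTab="/etc/security/keytabs/kafka-client.keytab" \
    principal="svc-snaplogic@EXAMPLE.COM";
```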
Then I was thinking, instead of using Kafka, that the FeedMaster has some inherent queuing built in, so perhaps I could have my parent process fire off REST calls to an Ultra triggered task for each child. I don’t really need any kind of response back to the parent job for each item. But I wasn’t sure how much queuing/waiting would be tolerated that way, like if I just tried to fire off thousands of requests at once. Any thoughts/suggestions?
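One option I’m considering, to avoid dumping all those requests on the FeedMaster at once, is throttling on the client side so only N requests are ever in flight. A minimal sketch of the idea: `post_item` is a stand-in for the real HTTP POST to the Ultra task URL (which I’d substitute in), and the fake worker at the bottom just exercises the concurrency cap without touching the network:

```python
import concurrent.futures
import threading

def dispatch_items(items, post_item, max_workers=8):
    """Submit every item, but only max_workers run concurrently."""
    results = {}
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(post_item, item): item for item in items}
        for fut in concurrent.futures.as_completed(futures):
            results[futures[fut]] = fut.result()
    return results

# Fake worker: records peak concurrency instead of calling the Ultra task.
lock = threading.Lock()
in_flight = {"now": 0, "peak": 0}

def fake_post(item):
    with lock:
        in_flight["now"] += 1
        in_flight["peak"] = max(in_flight["peak"], in_flight["now"])
    try:
        return f"ok:{item}"
    finally:
        with lock:
            in_flight["now"] -= 1

out = dispatch_items(range(1000), fake_post, max_workers=8)
print(len(out), in_flight["peak"] <= 8)  # all items done, cap respected
```

The pool itself enforces the cap, so thousands of queued items just wait client-side instead of piling up on the FeedMaster.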