Support for tasks with folder polling

We still see that there are quite a lot of integrations where we depend on files to be sent and picked up.

I find no good way to trigger tasks by detecting files in a folder, other than building one or more pipelines that run frequently with a Directory Browser that checks for files in a folder and then routes the data to the pipeline that handles the transformation/execution.

And even if I set a schedule interval to run every minute, it is not very reliable. As you can see from the start times, the interval is quite far from every minute :)

[image: screenshot showing execution start times]

This type of setup will quickly fill up the dashboard with a lot of executions, especially if we need to poll frequently.

It would be awesome if there was a way to trigger a pipeline if a file is added in a folder…
With support for the common protocols that we have in the file snaps and (reliable) customizable polling intervals, this would simply need to send the same type of information to the pipeline as we get from a Directory Browser snap.
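As a rough illustration of the requested behaviour, here is a minimal Python sketch of one polling pass over a local folder. It is not an existing SnapLogic feature or API. The function name `poll_once` and the keys `name`, `path`, `size` and `modified` are hypothetical stand-ins for the kind of metadata a directory listing returns.

```python
import os
import tempfile


def poll_once(folder, seen):
    """Return one dict per file in `folder` that is not yet in `seen`.

    `seen` is a set of file names and is updated in place, so each file
    is reported only once. The dict keys are illustrative assumptions.
    """
    with os.scandir(folder) as entries:
        ordered = sorted(entries, key=lambda entry: entry.name)
    new_files = []
    for entry in ordered:
        if entry.is_file() and entry.name not in seen:
            info = entry.stat()
            new_files.append({
                "name": entry.name,
                "path": entry.path,
                "size": info.st_size,
                "modified": info.st_mtime,
            })
            seen.add(entry.name)
    return new_files


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        seen = set()
        with open(os.path.join(folder, "order_1.csv"), "w") as handle:
            handle.write("id,qty\n1,5\n")
        print(poll_once(folder, seen))  # first pass reports the new file
        print(poll_once(folder, seen))  # second pass reports nothing new
```

A real trigger would repeat this pass at the configured interval and hand each returned dict to the target pipeline.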

I would be happy to elaborate on a suggested setup, but I’m not sure if I’m the only one with this “problem”.

Hello JKarlsson and welcome to the SnapLogic Community! To make sure I’m clear on the use case, you need to call a Triggered task every time a file is placed in a specific folder, correct? If that is correct, I suggest you use the File Poller snap (https://docs-snaplogic.atlassian.net/wiki/spaces/SD/pages/1438806/File+Poller) with a “Polling timeout” value of -1 so the snap polls indefinitely. Connected to that snap you could have a Pipeline Execute to call the next pipeline on each file received or use a REST snap to call a Triggered task. To make the File Poller pipeline more resilient, you can set it up as a Scheduled task and make sure to check the box that prevents another instance from starting if one is already running.
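The REST option can be pictured as one HTTP call per detected file. The Python sketch below only builds such a request and does not send it. The URL, the token, the bearer-token header and the JSON body are all assumptions for illustration; use whatever URL and authentication your own Triggered task is configured with. `build_trigger_request` is a hypothetical helper name.

```python
import json
import urllib.request


def build_trigger_request(task_url, bearer_token, file_info):
    """Build (but do not send) a POST request describing one detected file."""
    body = json.dumps(file_info).encode("utf-8")
    return urllib.request.Request(
        task_url,
        data=body,
        method="POST",
        headers={
            # Assumption: the task accepts a bearer token and a JSON body.
            "Authorization": "Bearer " + bearer_token,
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # Placeholder values, not a real endpoint or credential.
    request = build_trigger_request(
        "https://example.invalid/triggered-task",
        "PLACEHOLDER_TOKEN",
        {"name": "order_1.csv", "path": "/inbound/order_1.csv"},
    )
    print(request.get_method(), request.full_url)
    # To actually call the task: urllib.request.urlopen(request, timeout=30)
```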

Hi rsramkoski
Thanks, I will give this a try and get back with results!
This requires quite strict error handling if Pipeline Execute is used. The REST snap would allow the sub-pipelines to be executed more independently, and any errors within them would not affect the poller pipeline.

Hello JKarlsson,

Thank you for the update and please do keep us posted. If the child pipeline that processes the files has its own error handling pipeline, the errors in the child should not roll up to the parent, which should allow the poller pipeline to continue functioning without impact.
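The isolation idea, where a failure in one child does not stop the poller, can be sketched in Python as follows. This is a generic illustration, not SnapLogic's error-pipeline mechanism. `dispatch_all` and `handler` are hypothetical names.

```python
def dispatch_all(files, handler):
    """Call `handler` for every file and collect failures instead of raising."""
    failures = []
    for file_info in files:
        try:
            handler(file_info)
        except Exception as error:
            # The "child" failed; record it and let the "parent" loop continue.
            failures.append((file_info["name"], str(error)))
    return failures


if __name__ == "__main__":
    def handler(file_info):
        # Stand-in for the child pipeline; fails on one kind of file.
        if file_info["name"].endswith(".bad"):
            raise ValueError("cannot parse file")

    files = [{"name": "a.csv"}, {"name": "b.bad"}, {"name": "c.csv"}]
    print(dispatch_all(files, handler))
```

The file after the failing one is still processed, which is the property the poller pipeline needs.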

Another consideration on this topic would be whether or not you want the pipeline statistics available for every file processed. If the File Poller parent pipeline is calling child pipelines, then we will only show the stats for the previous 100 child pipelines in Dashboard, so please keep that in mind when thinking about the design as well. If you need the stats for every file processed you could use a REST snap to call a Triggered task so every file or group of files processed has its own pipeline.

Hi,
We have now been running two input detections with File Pollers for a while during our tests.
It seems to work quite well overall.
We have been running with child pipelines, but I agree that visibility is affected in this scenario. We will most likely switch to using REST to call a Triggered task.

Sub-pipelines are cached, so if anything is changed in them, the parent pipeline needs to be stopped and restarted by the scheduled task. This will most likely also be solved by calling the sub-pipelines with REST.

It feels a bit like a workaround to have it running like this, compared to our old integration platform with multiple communication channel triggers on files and/or content.

Some things I noticed that are not directly related to the original topic:
The scheduler is very unpredictable. Even if it is set to start every minute, it can take 3-5 minutes before it triggers again.
And if I need to stop and disable a scheduled task, it might trigger 1-2 more times even after it is set to disabled.
This is not unique to the File Poller or continuously running pipelines, but rather a generic problem…