Directory Browser vs File Poller snap

Hi,
I am new to SnapLogic. I would like to read all the files in a specific directory at specific intervals. I am evaluating two options and would like to know the pros and cons of each.
Option 1: Directory Browser snap with Task
Option 2: File Poller snap

I would also like to make sure that the pipeline has finished processing before the next poll or browse of the directory.

What are your intervals? If less than 5 minutes, I recommend the File Poller approach, moving files to a “working” directory so the same files aren’t picked up by more than one run. Keep in mind that the File Poller runs continuously until its timeout is reached: even when it finds files to process, it sends them on to the next snap and keeps polling. Depending on how long your files take to process, it can poll the same files again before you finish with them, so this may be challenging to implement if you aren’t familiar with the snap.
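To make the “working” directory idea concrete, here is a rough sketch in plain Python. It is not SnapLogic code; the function name claim_files and the inbox/working directory names are made up for illustration. It assumes both directories are on the same filesystem (so the move is a rename) and that file names are unique.

```python
import os
from pathlib import Path


def claim_files(inbox: Path, working: Path, pattern: str = "*") -> list[Path]:
    """Move matching files from inbox to working and return their new paths.

    Hypothetical helper: once a file has been moved, a later poll of the
    inbox can no longer see it, so it cannot be picked up twice.
    """
    working.mkdir(parents=True, exist_ok=True)
    claimed = []
    for src in sorted(inbox.glob(pattern)):
        if not src.is_file():
            continue
        dst = working / src.name
        try:
            # Assumption: same filesystem, so this is a rename and not a copy.
            # Note that os.replace overwrites an existing file of the same name.
            os.replace(src, dst)
        except FileNotFoundError:
            # Another run moved the file first; skip it.
            continue
        claimed.append(dst)
    return claimed
```

Calling claim_files twice in a row returns the files the first time and an empty list the second time, which is the behavior you want between overlapping polls.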

If your interval is more than 5 minutes, you can use a Directory Browser in your pipeline. Create the task as scheduled and enable the “Do not start a new execution if one is already active” option to prevent multiple instances. This is a simpler pattern and easier to implement.
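If it helps to picture what that option does, the idea is roughly “skip this tick if the previous run is still going”. The sketch below is plain Python, not how SnapLogic implements it; the class name NonOverlappingJob is invented for illustration.

```python
import threading
from typing import Callable


class NonOverlappingJob:
    """Run a job on demand, but skip the trigger if a run is still active."""

    def __init__(self, job: Callable[[], None]) -> None:
        self._job = job
        self._lock = threading.Lock()

    def trigger(self) -> bool:
        """Return True if the job ran, False if it was skipped."""
        # A non-blocking acquire fails immediately when a run holds the lock.
        if not self._lock.acquire(blocking=False):
            return False
        try:
            self._job()
            return True
        finally:
            self._lock.release()
```

A scheduler would call trigger() at every interval; a tick that arrives while the previous run is still active is dropped instead of queued.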

Directory Browser doesn’t actually read the files. It just returns the list of files in a directory. So you’d have to use a File Reader for each of the files output by the Directory Browser.

I think what you want is the Multi File Reader instead. It basically combines the Directory Browser + File Reader functionality into one snap, and it’s far more efficient. But be aware that it has limited functionality during Validation – I think it only reads one file. If you Execute the pipeline, it will read them all.
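As a loose analogy in plain Python (these are not SnapLogic APIs, and the function names browse, read_each and read_all are made up): the first function only lists files, the second reads whatever it is handed, and the third does both in one step.

```python
from pathlib import Path
from typing import Iterable, Iterator


def browse(directory: Path, pattern: str = "*") -> list[Path]:
    """List matching files without reading them (the 'browser' step)."""
    return sorted(p for p in directory.glob(pattern) if p.is_file())


def read_each(paths: Iterable[Path]) -> Iterator[tuple[str, bytes]]:
    """Read each listed file and yield (file name, contents)."""
    for path in paths:
        yield path.name, path.read_bytes()


def read_all(directory: Path, pattern: str = "*") -> Iterator[tuple[str, bytes]]:
    """List and read in a single call (the combined step)."""
    return read_each(browse(directory, pattern))
```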

Kory, the File Poller Snap has an Only Output on Change check box. When it is selected, the Snap produces output only when there is a change.

From the documentation:

Only Output on Change

Select this check box to instruct the Snap to provide an output only when there is a change in the contents of the polled directory. When selected, the Snap provides an output during its initial run if it finds matching documents. However, it provides polling results in the next run only if the polled directory has newer files that match the pattern specified.
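My reading of that description, as a rough plain-Python sketch: the first poll reports every matching file, and later polls report only files that are new or modified since the previous poll. The class name ChangeOnlyPoller is made up, and comparing name plus modification time is my assumption about what “newer” means; the actual Snap may detect changes differently.

```python
import fnmatch
from pathlib import Path


class ChangeOnlyPoller:
    """Report matching files only when they are new or modified."""

    def __init__(self, directory: Path, pattern: str = "*") -> None:
        self.directory = directory
        self.pattern = pattern
        # (file name, modification time in ns) pairs seen on the last poll.
        self._seen: set = set()

    def poll(self) -> list:
        current = {
            (p.name, p.stat().st_mtime_ns)
            for p in self.directory.iterdir()
            if p.is_file() and fnmatch.fnmatch(p.name, self.pattern)
        }
        # Anything not seen last time is either a new or a modified file.
        changed = sorted(name for name, _ in current - self._seen)
        self._seen = current
        return changed
```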

Thanks @skatpally - I should have double-checked the documentation on the File Poller snap. It’s been a while since I’ve used it.

@mramaswamy - back to your original question. I think the Multi-File Reader snap is probably your simplest option to do what it sounds like you are doing.

The File Poller is typically used in an “always on” situation to poll continuously during a pipeline execution, for example in an Ultra Pipeline where it might poll the directory many times a minute, 24 hours a day.

Thanks for all your help. This is really helpful. I have not tried the Multi File Reader yet. At this point we are comfortable with Directory Browser + File Reader + a scheduled Task. I will keep you posted after we try the Multi File Reader.