Directory Browser vs File Poller snap

mramaswamy
New Contributor II

Hi,
I am new to SnapLogic. I would like to read all the files in a specific directory at specific intervals. I am evaluating two options and would like to know the pros and cons of each.
Option 1: Directory Browser snap with Task
Option 2: File Poller snap

Also, I would like to make sure that the pipeline has finished processing before the next polling/browsing of files.

7 REPLIES

koryknick
Employee

What are your intervals? If less than 5 minutes, I recommend the File Poller approach, moving files to a "working" directory to ensure the same files aren't polled across multiple runs. Keep in mind that the File Poller will run continuously until the timeout is reached. So even if it finds files to process, it will send those on to the next snap and continue polling. Depending on how long it takes to process your files, it is possible to poll the same files again before you finish, so this may be challenging to implement if you aren't familiar with the usage.
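The "working" directory idea can be sketched outside SnapLogic in plain Python. This is only an illustration of the pattern, not how the File Poller snap is implemented; the function name `claim_files`, the directory names, and the `*.csv` pattern are all hypothetical.

```python
import shutil
from pathlib import Path


def claim_files(inbox: Path, working: Path, pattern: str = "*.csv") -> list[Path]:
    """Move matching files out of the polled directory and return their new paths.

    Once a file has been moved, a later poll of `inbox` can no longer see it,
    so it cannot be picked up a second time. Name collisions in `working`
    are not handled in this sketch.
    """
    working.mkdir(parents=True, exist_ok=True)
    claimed = []
    for src in sorted(inbox.glob(pattern)):
        if not src.is_file():
            continue  # skip subdirectories that happen to match the pattern
        dest = working / src.name
        shutil.move(str(src), str(dest))
        claimed.append(dest)
    return claimed
```

Calling `claim_files` twice in a row returns the files on the first call and an empty list on the second, which is the property that keeps a fast poll from reprocessing the same files.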

If your interval is more than 5 minutes, you can use a Directory Browser in your pipeline. Create the task as scheduled and enable the "Do not start a new execution if one is already active" option to prevent multiple instances. This is a simpler pattern and easier to implement.
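For readers who want to see what "do not start a new execution if one is already active" means in practice, here is a rough Python analogy using a lock file. It is an assumption-laden sketch of the general idea, not SnapLogic's scheduler; `run_if_idle` and the lock path are hypothetical names.

```python
import os
from pathlib import Path


def run_if_idle(lock_path: Path, job) -> bool:
    """Run job() unless another run holds the lock; return True if it ran."""
    try:
        # O_EXCL makes creation fail if the lock file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False  # a previous run is still active, so skip this one
    try:
        os.close(fd)
        job()
        return True
    finally:
        os.unlink(lock_path)  # release the lock even if job() raised
```

A scheduler that calls `run_if_idle` at every interval simply skips a tick whenever the previous run has not finished, so files are never browsed while an earlier batch is still being processed.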

Kory, the File Poller Snap has an Only Output on Change check box. With it selected, the Snap produces output only when there is a change.

From Documentation

Only Output on Change

Select this check box to instruct the Snap to provide an output only when there is a change in the contents of the polled directory. When selected, the Snap provides an output during its initial run if it finds matching documents. However, it provides polling results in the next run only if the polled directory has newer files that match the pattern specified.
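One way to read the documented behavior is as a poller that remembers what it has already seen. The Python sketch below is an interpretation, not the Snap's actual logic: it assumes "newer files" means file names not seen in an earlier poll, and it returns only those names. The class name `ChangePoller` is hypothetical.

```python
from pathlib import Path


class ChangePoller:
    """Report matching files only when the directory has gained new ones."""

    def __init__(self, directory: Path, pattern: str = "*"):
        self.directory = directory
        self.pattern = pattern
        self._seen: set[str] = set()  # names reported by earlier polls

    def poll(self) -> list[str]:
        current = {p.name for p in self.directory.glob(self.pattern) if p.is_file()}
        new = sorted(current - self._seen)  # assumption: "change" means unseen names
        self._seen |= current
        return new  # empty list when nothing has changed since the last poll
```

The first poll returns every matching file, and later polls return nothing until a new file appears in the directory.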

Thom
New Contributor II

This post is a few years old, but I wanted to get clarity on this statement. You seemed to indicate that at the end of each polling interval, the output is passed downstream and polling continues until the timeout. That has not been my experience. For me, the output was only passed on once the timeout was hit. Files added to the polled location were added to the output at each polling event, but the output did not move on until the timeout was reached.
Is there a configuration for the snap that will achieve the behavior you experienced?
I have experimented with different combinations of "Only Output on Change" and "Exit on first matches" but have not achieved the behavior you describe.
Has your experience changed with this File Poller snap?
Thanks,

ptaylor
Employee

Directory Browser doesn't actually read the files. It just returns the list of files in a directory. So you'd have to use a File Reader for each of the files output by the Directory Browser.

I think what you want is the Multi File Reader instead. It basically combines the Directory Browser + File Reader functionality into one snap and it's far more efficient. But be aware that it has limited functionality during Validation; I think it only reads one file. But if you Execute the pipeline, it will read them all.
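The difference between listing and reading can be shown with a small Python analogy. This is not SnapLogic code; `browse` stands in for the listing step and `read_all` for the combined list-and-read step, and both names are hypothetical.

```python
from pathlib import Path
from typing import Iterator


def browse(directory: Path, pattern: str = "*") -> list[Path]:
    """Return matching file paths only; no file content is read."""
    return sorted(p for p in directory.glob(pattern) if p.is_file())


def read_all(directory: Path, pattern: str = "*") -> Iterator[tuple[str, bytes]]:
    """Yield (name, content) for each matching file, combining listing and reading."""
    for path in browse(directory, pattern):
        yield path.name, path.read_bytes()
```

With `browse` alone, the caller still has to open each path separately, which mirrors needing a File Reader after a Directory Browser.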