Forum Discussion
I won’t pretend to be an expert on the File Poller snap; I usually need to play with the settings a bit to get it to work the way I want. I do believe that with “Only Output on Change” enabled, it will retrieve an initial set of files in the directory but then won’t output anything else until a file is updated or added to the directory being polled.
With a Polling Timeout of -1, the File Poller snap will not end unless the pipeline is stopped or fails. With Only Output On Change enabled, the Polling Interval only matters if you expect a file to receive further updates while it is still being processed. Those are things you need to consider in your design of how files are landing and how you are processing them. You may wish to move files to another location to be processed to prevent re-capturing files that are in-flight.
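To make the "initial set, then only changes" behavior concrete, here is a rough Python sketch of that idea. It is not the File Poller snap's actual implementation; the function names `snapshot` and `changed_files` and the use of (modification time, size) as the change signature are assumptions made purely for illustration.

```python
import fnmatch
import os


def snapshot(directory, pattern="*"):
    """Map each matching file name to a (mtime_ns, size) signature."""
    state = {}
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file() and fnmatch.fnmatch(entry.name, pattern):
                st = entry.stat()
                state[entry.name] = (st.st_mtime_ns, st.st_size)
    return state


def changed_files(previous, current):
    """Return names that are new, or whose signature differs from the last poll."""
    return sorted(
        name for name, sig in current.items() if previous.get(name) != sig
    )
```

Calling `changed_files({}, snapshot(path, "*.csv"))` on the first poll returns every matching file; later polls compare against the previous snapshot and return only files that were added or modified. Filtering by pattern before comparing means unrelated changes in the directory produce no output.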
- marenas, 4 years ago, Contributor
@koryknick thank you for the response. I too am not an expert and I really appreciate all your thoughts on this.
The above settings turned out to be prone to data errors in my case. The directory has over 52k records. When I enable Only Output On Change, the pipeline kicks in every time the contents of the directory change, even when the changes do not relate to the file that I am looking for, and then polls continually (due to the Polling Timeout of -1). It eventually finds the matching file; however, that file is not the most recently updated one. I am trying this configuration below and I will play around with the settings depending on the results. btw, the scheduled task is set to run every 5 mins for this.
I will take note of your recommendation to move files to another location for processing.
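For what it's worth, the "move it elsewhere before processing" idea, combined with picking the most recently modified match, could look something like the sketch below outside of SnapLogic. The function name `claim_newest` and the `inbox`/`processing` directory arguments are hypothetical, and this is only an illustration of the approach, not how the snap works.

```python
import fnmatch
import os
import shutil


def claim_newest(inbox, processing, pattern):
    """Move the most recently modified matching file out of the polled directory.

    Returns the new path, or None when nothing in `inbox` matches `pattern`.
    """
    with os.scandir(inbox) as entries:
        candidates = [
            e for e in entries
            if e.is_file() and fnmatch.fnmatch(e.name, pattern)
        ]
    if not candidates:
        return None
    # Pick the file with the latest modification time.
    newest = max(candidates, key=lambda e: e.stat().st_mtime_ns)
    os.makedirs(processing, exist_ok=True)
    return shutil.move(newest.path, os.path.join(processing, newest.name))
```

Because the file leaves the polled directory once it is claimed, a later poll will not pick it up again while it is still in flight.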