
Reading first N records from a file?

PSAmmirata
Employee

I have a file with a large number of rows. I want to read the file and parse it, but stop reading after N rows have been parsed. Is this possible? I tried using a Head snap after the parser snap, but the File Reader continues to read the entire file, and the pipeline doesn't complete until the entire file has been read, even though I'm only interested in the first N rows.

1 ACCEPTED SOLUTION

Continuing the discussion from Reading first N records from a file?:

Ok, thanks for the explanation. That makes sense. Your issue isn't really with the fact that the snaps upstream (the File Reader + CSV Parser, or whatever) keep running. It's with the fact that the snaps downstream (a Formatter + File Writer, perhaps) do – they don't complete (write the file) as soon as the Head snap has written the only document it will write.

So, yes, there's actually a simple fix we can make to the Head snap to do just that: close the output view as soon as the desired number of documents are written. This will cause the downstream snaps to finish writing their output. I just tried it and it works as expected. I think we should be able to get this fix into our forthcoming release planned for Nov 14.
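The Head-snap fix described here is internal to SnapLogic, but the behavior is the same as a generator that closes its output after N items: once it finishes, downstream consumers see end-of-stream and can complete without waiting for upstream to be exhausted. A minimal sketch of that pattern (names are illustrative, not SnapLogic APIs):

```python
def head(stream, n):
    """Yield at most n documents, then stop ('close the output view').

    As soon as this generator returns, any downstream consumer sees
    end-of-stream and can finish writing its output, regardless of how
    much data remains upstream.
    """
    it = iter(stream)
    for _ in range(n):
        try:
            yield next(it)
        except StopIteration:
            return  # upstream ran out before n documents
```

Because the generator pulls lazily, upstream is only asked for exactly N documents; nothing past the cutoff is ever consumed.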



skatpally
Former Employee

The Exit snap can help, but it will mark the pipeline as a failure in the Dashboard.

While not ideal, using the Exit snap helped. I need to ensure that the downstream processing of the N records completes before the Exit snap triggers. I can use the threshold limit in the Exit snap to "delay" the exit a bit.

Supratim
Contributor III

@PSAmmirata Try the Head snap.

I believe he did.


Diane Miller
Community Manager