01-02-2018 06:32 AM
I have a very big file (around 2 GB). I have to fetch it from FTP, split it into chunks (let's say 2 million records each), and create the new files on the FTP server. Please suggest an effective method for processing the data. I tried using the Router and Sequence snaps. Is there a better way to do it?
01-02-2018 01:30 PM
Have you tried the Group By N snap to split it based on count?
01-03-2018 01:43 AM
Hi @aleung,
We have implemented this with the Sequence and Router snaps. It takes one hour to process the whole file and create new files of 2.5 million records each, which is pretty good. Still, I am curious whether we can tune this to take less time. Could you help with this?
01-03-2018 10:46 AM
This type of job is best done at the endpoints rather than in a snap. Python is really good at this kind of work, so doing the split either before or after the transfer is best. I know this isn't ideal, but I can't think of a better way.
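If you do go the endpoint route, a minimal Python sketch of the split step could look like the following. The function name, chunk-file naming scheme, and default chunk size are my own choices, not anything from SnapLogic; it streams the file line by line so memory use stays flat even for a multi-GB input.

```python
import os

def split_file(src_path, out_dir, records_per_chunk=2_000_000, prefix="chunk"):
    """Split a text file into chunks of at most `records_per_chunk` lines.

    Reads the source one line at a time (constant memory) and starts a new
    output file every `records_per_chunk` records. Returns the chunk paths.
    """
    os.makedirs(out_dir, exist_ok=True)
    chunk_paths = []
    out = None
    try:
        with open(src_path, "r", encoding="utf-8") as src:
            for i, line in enumerate(src):
                # Open a fresh chunk file at every chunk boundary.
                if i % records_per_chunk == 0:
                    if out is not None:
                        out.close()
                    path = os.path.join(
                        out_dir, f"{prefix}_{i // records_per_chunk:04d}.txt")
                    chunk_paths.append(path)
                    out = open(path, "w", encoding="utf-8")
                out.write(line)
    finally:
        if out is not None:
            out.close()
    return chunk_paths
```

Each chunk could then be pushed back to the server with the standard library's `ftplib`, e.g. `FTP.storbinary("STOR " + name, open(path, "rb"))`, so the whole fetch-split-upload cycle stays outside the pipeline.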