This seems like a fun one to experiment with. You say "based on the total records": does that mean you need to identify the total number of records and split them evenly across a known file count, or do you have a max number of records per file and will be writing up to that max into an unknown number of files?

Assuming the latter, I was thinking of adding a mapper that provides a row number, then creating a router where rows <= the max rowcount go down one output to a file, and the rest go out another output that calls the pipeline again. The process would start all over with the remaining records and continue until no records are left. I'm also thinking a file counter would be passed in and incremented on each call, so you'd get file1, file2, etc. Never tried it, but I think it's worth a shot.
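If it helps to see the logic outside of the pipeline, here's a minimal Python sketch of the same recursive idea. To be clear, this is just an illustration, not actual pipeline configuration; the function name `split_records`, the `max_rows` parameter, and the `file{n}.txt` naming scheme are all assumptions I made up for the example.

```python
# Illustrative sketch only: split records into files of at most max_rows each,
# "calling the pipeline again" via recursion on the remaining records.

def split_records(records, max_rows, file_counter=1):
    """Write up to max_rows records per file, recursing on the remainder."""
    if not records:
        return  # base case: no records left, stop recursing

    # The "router": the first max_rows records go to the current file...
    chunk, remainder = records[:max_rows], records[max_rows:]
    with open(f"file{file_counter}.txt", "w") as f:
        f.writelines(f"{row}\n" for row in chunk)

    # ...and the rest goes back through with an incremented file counter
    split_records(remainder, max_rows, file_counter + 1)

# Example: 10 records at 4 per file -> file1 (4 rows), file2 (4), file3 (2)
split_records([f"record {i}" for i in range(10)], max_rows=4)
```

Same shape as the pipeline version: the row-number/router step becomes the slice, and the recursive pipeline call becomes the recursive function call carrying the counter along.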