Forum Discussion

ash42
New Contributor III
4 years ago

Parsing large files

Hey all,

I am trying to read a file (260+ MB), parse it, and then insert the data into a table. When I use SQL Bulk Load, the CSV Parser always seems to get stuck at an output of 4100 docs,

and when I use the normal Insert table snap, the entire process takes 20-25 minutes.

File format sample:

CSV Parser

PS: it works fine and quickly with smaller files.

3 Replies

  • bojanvelevski
    Valued Contributor

    The difference between Insert and Bulk is that Insert writes every record separately, while Bulk does it in chunks: as soon as the first chunk is ready, Bulk loads that chunk into the DB (see the sketch below for the general idea). Maybe that's the holdup you are facing and you didn't wait long enough? I am also curious about the Join and what's going on in there, because Bulk Load will result in only one object on the output, and that's not the original. Also, what's the snap after the CSV Parser?

    The holdup is definitely not in the CSV Parser, so you can try disconnecting the snaps one after another to find where the problem is.
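
    Here is a rough, non-SnapLogic sketch in Python (the table, columns, row count, and chunk size are made up for illustration) of why chunked loading tends to be much faster than writing one record at a time:

    ```python
    # Rough illustration only: per-record inserts vs. chunked bulk inserts.
    # Table, columns, row count and chunk size are made up for the example.
    import sqlite3
    import time

    rows = [(i, f"value-{i}") for i in range(200_000)]

    def per_record_insert(conn):
        # One INSERT statement per row.
        cur = conn.cursor()
        for row in rows:
            cur.execute("INSERT INTO demo (id, val) VALUES (?, ?)", row)
        conn.commit()

    def chunked_bulk_insert(conn, chunk_size=10_000):
        # Rows are buffered and written one chunk at a time,
        # which is the idea behind a bulk-load style write.
        cur = conn.cursor()
        for start in range(0, len(rows), chunk_size):
            cur.executemany("INSERT INTO demo (id, val) VALUES (?, ?)",
                            rows[start:start + chunk_size])
        conn.commit()

    for loader in (per_record_insert, chunked_bulk_insert):
        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE demo (id INTEGER, val TEXT)")
        start_time = time.perf_counter()
        loader(conn)
        print(f"{loader.__name__}: {time.perf_counter() - start_time:.2f}s")
        conn.close()
    ```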

    • ash42
      New Contributor III

      Thanks, it seems like that was the problem. The reason I used a Join with a Tail snap is that I do not want the pipeline to continue executing until the insert operation is fully completed, so I used the Tail snap to act as a waiting snap.

      This is the updated pipeline:

      • bojanvelevski
        Valued Contributor

        This might, and probably will, result in the pipeline proceeding even before the data is loaded into the table. The Join is one of the solutions to achieve what you need, but you need to be careful while configuring it. Combine both: the new setup + a Join (1-to-1).

        That will result in one object on the output; it will wait for both inputs to complete and then move the execution forward.
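
        If it helps to picture it, here is a small Python sketch (the two branch functions are made up) of that join-style synchronization: nothing downstream runs until both branches have finished, and the two results come out as a single object:

        ```python
        # Conceptual sketch of a 1-to-1 Join used as a barrier:
        # execution continues only after BOTH upstream branches complete.
        # The two branch functions below are made up for illustration.
        from concurrent.futures import ThreadPoolExecutor, wait
        import time

        def bulk_load_branch():
            time.sleep(2)                      # stands in for the bulk load finishing
            return {"load_status": "loaded"}

        def second_branch():
            time.sleep(1)                      # stands in for the other Join input
            return {"tail_status": "done"}

        with ThreadPoolExecutor() as pool:
            futures = [pool.submit(bulk_load_branch), pool.submit(second_branch)]
            wait(futures)                      # block until both inputs are complete

            # Merge the two results into the single object that moves downstream.
            joined = {}
            for f in futures:
                joined.update(f.result())

        print("continuing with:", joined)
        ```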