Trim last lines in csv file

Dhivya_Aroma1
New Contributor III

Team,

I have a log file in CSV format, but the file contains status information in the first 5 lines and the last 3 lines.

I can skip the first 5 lines using the Skip lines property in the CSV Parser, but I could not remove the last 3 lines.

Because of this, I am getting an error in the CSV Parser.

My Sample log file:

2020-02-04 21:49:01 STATUS: Task started
2020-02-04 21:49:01 STATUS: Connection established
2020-02-04 21:49:13 STATUS: Executing records…
START RECORD BLOCK
"Column1","Column2","Column3","Column4","Column5"
Value1,Value2,Value3,Value4,Value5
Value10,Value20,Value30,Value40,Value50
Value100,Value200,Value300,Value400,Value500

END RECORD BLOCK

  • Task completed successfully
    ============================================================================

Can you help me parse my CSV file?

Thanks,
Dhivya.

1 ACCEPTED SOLUTION

dimitri_hristov
Contributor

Hi,

I tried to recreate the error by parsing the data from the original post, and yes, the CSV Parser threw an error. Still, the desired data from the .csv file were present on the output of the CSV Parser Snap, correctly transformed into JSON format. I too skipped the first 5 lines by using the “Skip lines” property in the CSV Parser.
However, by selecting the “discard error data and continue” option in the drop-down for the Error View in the CSV Parser, the problem disappeared altogether. No error, and the CSV data was correctly parsed.
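
For readers outside SnapLogic, the same idea (skip a fixed preamble, then discard records that fail to parse and keep going) can be sketched in Python. This is only an illustration, not how the CSV Parser Snap is implemented: the function name `parse_discarding_errors` is hypothetical, and it assumes a record counts as bad when its field count differs from the header's.

```python
import csv


def parse_discarding_errors(text, skip_lines):
    """Skip a fixed preamble, then keep only rows that match the header width."""
    lines = text.splitlines()[skip_lines:]
    reader = csv.reader(lines)
    header = next(reader, None)
    if header is None:
        return [], []
    good, discarded = [], []
    for row in reader:
        if len(row) == len(header):
            good.append(dict(zip(header, row)))
        else:
            # Assumption: blank and footer lines end up here instead of
            # aborting the whole parse.
            discarded.append(row)
    return good, discarded
```

The `skip_lines` value has to match however many preamble lines precede the header row in the actual file.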

BR,
Dimitri

9 REPLIES

koryknick
Employee

Attached is an example pipeline that reads in the full file and uses Binary to Document so that the Mapper Snap can manipulate the “content” of the file. The Mapper splits the content by the record separator (CR/LF), filters out records that don’t contain the field delimiter (comma), rejoins the content with the record separator, and then passes the result to a CSV Parser to get the data. The Mapper Snap does the heavy lifting by removing all of the header and footer information before it reaches the CSV Parser.
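
As a rough Python analogue of the split / filter / rejoin steps described above (not the SnapLogic expression used in the attached pipeline; the function names `strip_non_csv_lines` and `parse_csv` are hypothetical):

```python
import csv


def strip_non_csv_lines(content, record_sep="\r\n", delimiter=","):
    """Drop every record that does not contain the field delimiter."""
    records = content.split(record_sep)           # split by record separator
    kept = [r for r in records if delimiter in r]  # filter header/footer lines
    return record_sep.join(kept)                   # rejoin for the CSV parser


def parse_csv(content):
    """Parse the cleaned content, using the first kept line as the header."""
    return list(csv.DictReader(content.splitlines()))
```

One caveat of this filter: a status line that happens to contain a comma would be kept, and a single-column data row would be dropped, so it suits files like the sample where only the data rows contain the delimiter.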

Note that I’ve also set the Output view to Binary in the Mapper snap, which saves a step of doing a Document to Binary prior to the CSV Parser.

I hope this helps!

Example_Read_file_with_header_and_footer_2020_06_29.slp (6.4 KB)

Hi @koryknick, when I generate a pipe-separated file (using the CSV Formatter), an empty line is created at the end of the file, as shown below. How do I remove the empty line at the end of the file?

image

koryknick
Employee

@extiva - What you are seeing is an artifact of the viewer you are using. The line is technically not there, but the viewer is rendering it since the previous line has the CR/LF, which is typical. Most parsers would ignore this “extra” line that you are seeing.
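
A small Python illustration of the point, using a hypothetical two-record pipe-separated sample: splitting naively on CR/LF produces a trailing empty string (the "extra line" a viewer shows), while a CSV parser sees only the two records.

```python
import csv
import io

data = "a|b\r\nc|d\r\n"  # two records, each terminated by CR/LF

# Naive split: the final CR/LF leaves an empty string after the last record.
naive = data.split("\r\n")

# A CSV parser treats the final CR/LF as a terminator, not as a new record.
records = list(csv.reader(io.StringIO(data, newline=""), delimiter="|"))
```

Here `naive` has three elements, the last one empty, whereas `records` has exactly two rows.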

By the way, I’ve updated the example pipeline to use data streaming logic so as not to over-utilize memory for large files. The concept is the same where it pulls individual records in and filters any that don’t contain a comma, but it will work on individual records as opposed to the original that consumed the entire dataset into memory (i.e. Binary to Document).
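
The streaming variant can be sketched in Python with a generator, so that only one record is held in memory at a time. This is an analogue of the idea, not the updated pipeline itself, and `iter_csv_records` is a hypothetical name:

```python
import csv
import io


def iter_csv_records(lines, delimiter=","):
    """Lazily yield parsed rows, skipping lines without the delimiter."""
    # `lines` can be an open file or any other iterable of text lines.
    candidates = (line for line in lines if delimiter in line)
    yield from csv.reader(candidates, delimiter=delimiter)
```

Because the filter works on physical lines, this sketch assumes no quoted field contains an embedded line break.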

Example_Read_file_with_header_and_footer_2021_04_30.slp (12.9 KB)

Thank you.

sindhu
New Contributor II

Hi @extiva

Has a solution been identified for this? I am facing the same issue now.