Forum Discussion

suresh_fm
New Contributor
2 years ago
Solved

filter to remove header rows

Hi, I have flat files where the number of header rows is not fixed.

Example file content (abc.txt):

aa
bb
cc
1|2|3|4

Here aa, bb, and cc are header rows that I do not want to load. My data starts at the line beginning with 1, and | is my delimiter.

The number of header lines is not fixed: there are 3 above, but there can be more at times.

I want to filter out the headers.

An easy way to filter would be to find the rows that do not contain | and remove them.

My approach is file reader --> csv parser --> filter.

In my csv parser I have skip lines set to 0, contains header unchecked, and the delimiter set to |.

My csv parser does not accept this and gives me the error: "The number of CSV data fields is greater than the number of column names in the property: number of CSV data fields = 50, number of header columns = 1"

The output of my file reader is binary, so I cannot put a filter there. What would be the correct approach?

 

  • Hello suresh_fm,

    Please find attached a pipeline, together with the sample file I used, that parses the txt file the way you described.

    Let me know if this helps you.

    Regards,

    Aleksandar.
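
    The attached pipeline is not reproduced in this thread. As a rough illustration of the idea from the question (treat the input as text lines, drop every line that does not contain the delimiter, and only then parse the rest as CSV), here is a minimal sketch in plain Python rather than in SnapLogic. The function name parse_delimited_rows and the UTF-8 encoding are assumptions made for the example, not part of the original pipeline.

```python
import csv


def parse_delimited_rows(raw: bytes, delimiter: str = "|",
                         encoding: str = "utf-8") -> list[list[str]]:
    """Drop lines that lack the delimiter, then CSV-parse what remains.

    Hypothetical helper for illustration only; it mirrors the
    "remove rows without |" idea, not any specific SnapLogic Snap.
    """
    # The reader output is binary, so decode it to text first
    # (UTF-8 is an assumption about the file's encoding).
    text = raw.decode(encoding)

    # Keep only lines containing the delimiter; header lines such as
    # "aa", "bb", "cc" contain no "|" and are discarded here.
    data_lines = [line for line in text.splitlines() if delimiter in line]

    # csv.reader accepts any iterable of strings, one per row.
    return list(csv.reader(data_lines, delimiter=delimiter))


sample = b"aa\nbb\ncc\n1|2|3|4\n"
rows = parse_delimited_rows(sample)
print(rows)  # [['1', '2', '3', '4']]
```

    Note that this only works while header lines never contain | themselves, and it splits on line breaks before parsing, so quoted fields with embedded newlines would not survive.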
