Parquet Reader & Writer through Azure S3

To save data in Parquet format in S3, use the pipeline configuration below.

This pipeline creates the metadata from the data itself, though it uses the Parquet data type ‘binary’, which is equivalent to a string.

The first mapper converts every value in the document to a string. We used the arrow function below in the first mapper:
$.mapValues((value, key) => value==null?"":value.toString())
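
To see what this expression does to a single document, here is a minimal sketch in plain JavaScript. It is not the pipeline expression itself: the function name stringifyValues and the sample document are made up for illustration, and Object.entries / Object.fromEntries stand in for the mapper's mapValues.

```javascript
// Hypothetical stand-in for the mapper expression: every value becomes a string,
// and null/undefined values become an empty string.
function stringifyValues(doc) {
  return Object.fromEntries(
    Object.entries(doc).map(([key, value]) => [
      key,
      value == null ? "" : value.toString(),
    ])
  );
}

// Made-up sample document with mixed types and a null field.
const sample = { id: 42, name: "abc", active: true, note: null };
console.log(stringifyValues(sample));
```

Note that calling toString() on a nested object does not give a useful string, so this approach only suits flat (non-nested) documents.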

The second mapper creates the metadata from the data. The arrow function used for this:

To read the file back, configure the Parquet Reader as follows:


Please note that “Use old data format” may be critical; otherwise the read may fail. This is to be checked when the data are not nested.
