Unable to write Salesforce data to S3 in Parquet format

Roger667
New Contributor III

I am using the Salesforce Read Snap to read data and write it to S3 in Parquet format, but I am getting an error because of the 'picklist' data type, which is not compatible with Parquet. How can I handle such data types?

1 ACCEPTED SOLUTION

Hi @Roger667 

Could you please check your SF Read Snap and make sure the 'Match datatype' option is checked?

manichandana_ch_0-1696409757505.png

Thanks !

12 REPLIES

Hi @manichandana_ch 
Thanks for solving the buffer issue. That worked, but it led to another error. The Parquet writer is not able to write boolean data types, even though I have them in the Excel file. It identifies boolean columns as string, even though I can see in the metadata that those columns are boolean, not string.

Roger667_0-1696402637078.png

Hi @Roger667 

Could you please check your SF Read Snap and make sure the 'Match datatype' option is checked?

manichandana_ch_0-1696409757505.png

Thanks !

Roger667
New Contributor III

Thanks, @manichandana_ch. This solution worked for me. I have a question that may extend beyond the current topic: how should I design the pipeline to accommodate incremental loads?

Hi @Roger667 

After getting the data from the SF Read Snap, you can take a copy of the stream, compute the max value of the last modified date (or any other audit date column), and store it in a file or a config table, if you have one. On the next run, read that file/table before the SF Read Snap to get the stored lastmodifieddate value, then add a filter in the SF Read Snap so that it returns only the records whose last modified date is greater than that value.

manichandana_ch_0-1696414391241.png

This is the expression added in the WHERE clause of the SF Read Snap, where $lastmodifieddate comes from the file/table read before the SF Read Snap. If you prefer using a file, you can store it in your local SLDB or any other location that is available to you.
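Outside of SnapLogic, the same watermark idea can be sketched in a few lines of Python. This is only an illustration of the pattern, not the pipeline itself: the state file name, the first-run default, the helper names, and the assumed timestamp format of the incoming records (for example `2023-10-04T08:15:30.000+0000`) are all assumptions made for the example.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

STATE_FILE = Path("sf_watermark.json")       # hypothetical location of the stored value
DEFAULT_WATERMARK = "1970-01-01T00:00:00Z"   # assumed starting point for the first run
SOQL_FORMAT = "%Y-%m-%dT%H:%M:%SZ"           # UTC datetime literal, unquoted in SOQL


def read_watermark(path=STATE_FILE):
    """Return the stored last modified date, or the default on the first run."""
    if path.exists():
        return json.loads(path.read_text())["lastmodifieddate"]
    return DEFAULT_WATERMARK


def where_clause(watermark):
    """Build the filter that restricts the read to records changed after the watermark."""
    return f"LastModifiedDate > {watermark}"


def parse_sf_datetime(value):
    """Parse a record timestamp; the input format is an assumption for this sketch."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%f%z")


def next_watermark(records, current):
    """Return the max LastModifiedDate in the batch, or `current` if the batch is empty."""
    latest = datetime.strptime(current, SOQL_FORMAT).replace(tzinfo=timezone.utc)
    for record in records:
        latest = max(latest, parse_sf_datetime(record["LastModifiedDate"]))
    # Sub-second precision is dropped here, so a later run may re-read a few
    # records from the same second rather than miss them.
    return latest.astimezone(timezone.utc).strftime(SOQL_FORMAT)


def write_watermark(value, path=STATE_FILE):
    """Persist the new watermark for the next run."""
    path.write_text(json.dumps({"lastmodifieddate": value}))
```

A run would call `read_watermark()`, pass `where_clause(...)` to the read step, and call `write_watermark(next_watermark(records, watermark))` only after the write to S3 has succeeded, so a failed run does not advance the stored value.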

Hope this helps !

Thanks!

Roger667
New Contributor III

Thanks, @manichandana_ch. Everything is running smoothly, except for a minor, peculiar issue with the mapping CSV file. After the left-join lookup Mapper, I am getting null values even though the data types and their corresponding converted data types are present in the Excel file. When I moved those records from the bottom to the top of the Excel sheet, it worked fine. Any thoughts on why this is happening? Also, when I moved several records to the top, I got another issue at the Parquet Writer:

Roger667_0-1696417590008.png