โ06-28-2021 09:20 PM
Hi team,
Looking for a suggestion/enhancement to achieve the below senario. when i read a csv file (it might have \n\r (windows) & \n(unix) ). Who to fix this โฆ
Today I read the file and when I do the bulk load using redshift bulk load snap and the data gets loaded with the \n\r . How can I escape these characters?
When i look at the properties of the snap , i could see the below. but itโs not working.
Solved! Go to Solution.
โ07-01-2021 02:38 AM
so โฆ this gave me the solution
$content.contains(โ\r\nโ)?
($content.replaceAll(โ\r\nโ,โ\nโ) && $content.replaceAll(โ\nโ,โโ) ) : $content.replaceAll(โ\nโ,โ\nโ)
โ06-30-2021 04:39 PM
Here is a community post about stripping line endings; maybe this could be used to normalize embedded EOL characters in your data, such as \r\n -> \n
.
I can research Redshift Bulk Loadโs additional options โฆ The COPY command has quite a few options, I see a CSV data format parameter too.
โ06-30-2021 11:33 PM
Well, thanks for coming back @mbowen โฆ my target is redshift โฆ I will be getting the file from the win / Unix system and blow will be its behaviour
Win
Unix
I am trying to read the csv file irrespective of the system
i am using the blow code for replacing the \r\n โ ($content.replaceAll(โ\r\nโ,โ\nโ) && $content.replaceAll(โ\nโ,โโ) )
Problem - when I pass the Unix file it replaces the \n to โโ and makes the csv invalid.
how to accommodate them both
โ07-01-2021 02:38 AM
so โฆ this gave me the solution
$content.contains(โ\r\nโ)?
($content.replaceAll(โ\r\nโ,โ\nโ) && $content.replaceAll(โ\nโ,โโ) ) : $content.replaceAll(โ\nโ,โ\nโ)
โ07-01-2021 05:57 AM
@makeshwar - be cautious using the Binary to Document snap - it will place the entire input file into memory. This is fine if you are dealing with MB size files, but if you get into multiple GB, it is possible to consume a majority of your JVM memory on the snaplex node.
Is it possible to convert the Windows files to Unix before writing to S3? Or have them dropped to a different location so they can be translated to UNIX carriage control prior to loading into Redshift?