String text in JSON output incorrectly using csv parser/formatter to csv
03-16-2023 07:45 AM
I’ve got the following text in a description field in my raw JSON file, where the double quotes are obviously escaped. How can I change settings in the JSON formatter and/or the CSV parser/formatter to fix this problem? The snaps are at their default settings.
JSON:
“Charts the development of Black Art in the United States and Great Britain from 1945 to the present through Paul Gilroy’s notion of the \“Black Atlantic\” – the ways that a distinct aesthetic emerged from a fusion of West African, American, and British traditions. Also listed as ART 294.”,
But when output to csv, it looks like this:
colA: Charts the development of Black Art in the United States and Great Britain from 1945 to the present through Paul Gilroy’s notion of the \Black Atlantic\" – the ways that a distinct aesthetic emerged from a fusion of West African
colB: American
colC: and British traditions. Also listed as ART 294."
So, it appears that the commas in the text are causing it to split into multiple columns and the escaped double quotes don’t end up as just double quotes. In other words, the default setting of '' in the CSV formatter doesn’t appear to be working to resolve text fields like the one above.
I found this on a site, which states that if a backslash is immediately followed by a double quote (\") that the csvformat function won’t work: [CSV-200] CSVFormat cannot read its own output if input contain escape character followed by quote c.... Not sure if this is the same function that your csv formatter snap is using?
UPDATE
The recommendation on the site above was to remove either the " or the \ and I removed the \ and it fixed my text problem!
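For anyone hitting the same column split: a minimal sketch of the quoting convention that keeps a field with commas and double quotes in one column. It uses Python's standard `csv` module, not the SnapLogic snaps, so it only illustrates the expected CSV text. The helper names `write_row` and `read_row` are made up for this example, and the description is shortened from the one above.

```python
import csv
import io

def write_row(fields):
    """Serialize one row with Python's default CSV dialect."""
    buf = io.StringIO()
    csv.writer(buf).writerow(fields)
    return buf.getvalue()

def read_row(line):
    """Parse one CSV line back into a list of fields."""
    return next(csv.reader(io.StringIO(line)))

# Shortened version of the description from the post, after JSON unescaping.
description = (
    'Gilroy\'s notion of the "Black Atlantic" - a fusion of West African, '
    'American, and British traditions.'
)

# Splitting on commas without quoting gives the three-column symptom.
naive_columns = description.split(",")

# Proper quoting wraps the field in quotes and doubles the inner quotes.
line = write_row([description])
round_trip = read_row(line)
```

Here `naive_columns` has three entries, while `round_trip` comes back as a single field equal to the original text. Whether the snap can be configured to double the quotes instead of using a backslash escape is something I have not verified.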
03-21-2023 12:46 PM
So, the site I found fixed the issue with my text for the description field getting spread incorrectly across three columns, but now I have roughly 250 rows out of about 7000 rows where the description is messed up, with characters like apostrophes displayed incorrectly on the csv output.
Issues:
- Apostrophes are output incorrectly, for example: women’s becomes womenâ€™s
- “bad” and “wild” become â€œbadâ€ and â€œwildâ€
- An em-dash in text like P—GRK becomes Pâ€”GRK
Can someone help me with this?
New note: I discovered that if I open my CSV in Notepad++, the characters all appear correct and not garbled as above. So I really don’t know what to do now.
Can someone please help me with this?
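A note on the symptom: text that looks right in Notepad++ but garbled elsewhere usually means the file itself is fine (most likely UTF-8) and the viewing application is decoding it as Windows-1252. That is an assumption about this particular file, not something confirmed in the thread. The Python sketch below reproduces the effect and shows one common workaround, writing the CSV with a UTF-8 byte order mark so that spreadsheet applications can detect the encoding. The function names are hypothetical.

```python
import csv
import io

def misread_as_cp1252(text: str) -> str:
    """Simulate a viewer that decodes UTF-8 bytes as Windows-1252."""
    return text.encode("utf-8").decode("cp1252")

def repair(garbled: str) -> str:
    """Undo the misreading by reversing the two steps."""
    return garbled.encode("cp1252").decode("utf-8")

def to_csv_bytes(rows, encoding: str = "utf-8-sig") -> bytes:
    """Render rows as CSV bytes; 'utf-8-sig' prepends a UTF-8 BOM."""
    buf = io.StringIO(newline="")
    csv.writer(buf).writerows(rows)
    return buf.getvalue().encode(encoding)

garbled = misread_as_cp1252("women\u2019s")         # right single quote
restored = repair(misread_as_cp1252("P\u2014GRK"))  # em dash
payload = to_csv_bytes([["women\u2019s"]])
```

In this sketch `garbled` matches the "womenâ€™s" pattern, and `payload` starts with the bytes `EF BB BF`. If the pipeline cannot emit a BOM, importing the file in the spreadsheet application with UTF-8 selected explicitly is another option.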
03-21-2023 04:46 PM
Hi @vincenr
Good day. These are extended (non-ASCII) characters, which Excel or another app may display as special characters.
So, what do you want to do with these characters?
Here are two options: (1) replace the extended characters with regular ones, or (2) remove those characters entirely.
$the_text_description.replace(/”|“|‘|’|–|—/g, m=> match m { '”'=> '"', '“'=> '"', "‘"=> "'", "’"=> "'", "—"=> "-", "–"=> "-" })
Or you can remove those characters entirely:
$the_text_description.replace(/”|“|‘|’|–|—/g, '')
Thanks,
MM
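For comparison outside the pipeline, here is a rough Python equivalent of the first option (mapping typographic punctuation to plain ASCII). It is a sketch only, and the names `PUNCT_MAP` and `to_ascii_punct` are made up for illustration.

```python
# Map typographic punctuation to plain ASCII equivalents.
PUNCT_MAP = str.maketrans({
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark (apostrophe)
    "\u2013": "-",  # en dash
    "\u2014": "-",  # em dash
})

def to_ascii_punct(text: str) -> str:
    """Replace curly quotes and dashes with their ASCII counterparts."""
    return text.translate(PUNCT_MAP)

sample = to_ascii_punct("women\u2019s \u201cbad\u201d P\u2014GRK")
```

Characters that are not in the map pass through unchanged, so other extended characters would still need their own entries.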
03-21-2023 05:04 PM
OK, I will try that. Thanks, but please leave this ticket open until I verify that it works.
03-22-2023 09:01 AM
I’d rather see the syntax for keeping only the characters a-zA-Z0-9.()’ and dropping everything else.
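One way to express an allow-list is a negated character class: match anything that is not in the list and replace it with nothing. The Python sketch below assumes the apostrophe in the list is the plain ASCII one, and it also keeps the space character so that words stay separated; that space is my addition, not part of the list above. The name `keep_allowed` is hypothetical.

```python
import re

# Anything that is NOT a letter, digit, period, parenthesis, apostrophe,
# or space. The space is an added assumption so words stay separated.
_DISALLOWED = re.compile(r"[^a-zA-Z0-9.()' ]")

def keep_allowed(text: str) -> str:
    """Drop every character that is not in the allow-list."""
    return _DISALLOWED.sub("", text)

cleaned = keep_allowed("women\u2019s (ART 294) \u2013 \u201cbad\u201d")
```

Note that a curly apostrophe is outside the list and gets dropped, so "women’s" comes out as "womens" unless the punctuation is normalized to ASCII first. The same negated character class should carry over to the `replace(/.../g, '')` form shown earlier in the thread, though I have not tested that in a pipeline.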