04-02-2018 07:13 PM
Is there any way to encode strings or perform a MySQL insert on a special character like ł or Ł? I can’t get it working, but it works just fine using another ETL process.
In my specific use case I’m trying to insert names that include special characters. They appear in MySQL as “?” instead of the special character.
Any help is appreciated!
04-03-2018 09:44 AM
Does that other ETL process work with the same database (what Oracle users might call a schema)?
https://dev.mysql.com/doc/refman/5.7/en/charset-applications.html
The problem here is that Unicode has a few problems, so it can’t be supported 100%, and everything and everyone supports it a bit differently. The default is often ASCII, so if you want to support Unicode you usually have to configure it explicitly (see the link above).
One good thing though: these characters are in Unicode, so it SHOULD be doable.
From Wikipedia:
“The Unicode codepoints for the letter are U+0142 for the lower case, and U+0141 for the capital. In the LaTeX typesetting system Ł and ł may be typeset with the commands \L{} and \l{}, respectively. The HTML-codes are &#321; and &#322; for Ł and ł, respectively.”
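To make those code points concrete, here is a small Python sketch that prints the code point, UTF-8 bytes, and Latin-1 encoding of each letter. It assumes (this is only one plausible cause, not something confirmed in this thread) that the “?” shows up because some step encodes the text in a single-byte character set that has no ł. The `describe` helper is a made-up name for illustration.

```python
import unicodedata


def describe(ch: str) -> dict:
    """Return the code point, Unicode name, and two encodings of one character."""
    return {
        "codepoint": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch),
        # UTF-8 can represent every Unicode character.
        "utf8": ch.encode("utf-8"),
        # Latin-1 has no slot for this letter; errors="replace" substitutes "?".
        "latin1": ch.encode("latin-1", errors="replace"),
    }


if __name__ == "__main__":
    for ch in "łŁ":
        print(describe(ch))
```

If a preview shows the two-byte UTF-8 form on the way in and a single “?” on the way out, that points at an encoding conversion somewhere in between.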
Your first task should be to run the process with a few of the offending records in preview mode. Check the target (just before the converter in the preview window). If the data is right there, the problem is with the target driver, or some target-centric area. Check the source. If the data is already bad there, the problem is with the source driver, or some source-centric area. THEY are usually the culprit. Then maybe try key points in between to find one that fails, and work back from there.
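The idea of checking each point along the way can be sketched in a few lines of Python. This is only an illustration: the stage names and sample values are made up, and you would paste in whatever the preview actually shows at each step.

```python
from typing import Optional, Sequence, Tuple


def first_bad_stage(expected: str,
                    observed: Sequence[Tuple[str, str]]) -> Optional[str]:
    """Return the name of the first stage whose value differs from expected.

    `observed` is an ordered list of (stage_name, value_seen_in_preview).
    Returns None if every stage still has the expected text.
    """
    for stage, value in observed:
        if value != expected:
            return stage
    return None


if __name__ == "__main__":
    # Hypothetical values copied by hand from previews and from the table.
    observed = [
        ("source preview", "Paweł"),
        ("mapper output", "Paweł"),
        ("target table", "Pawe?"),
    ]
    print(first_bad_stage("Paweł", observed))
```

Whatever stage comes back is where to start working backwards from.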
OH, and if you are using an Oracle version prior to 12 (I forget the EXACT version in 12 where they fixed it), and you find the problem is THERE, it might be an ENVIRONMENT problem. You will have to fix the environment (it is just a couple of environment variable settings) and restart the drivers and/or Snaplex.
04-03-2018 10:52 AM
We’re using Aurora, so we’re communicating using MySQL libs on UTF-8 defined InnoDB tables.
Under the hood, we’ve drilled down into how to test this:
So, under the hood, it seems like there’s a problem with either the insert or merge steps, or both.
We’ve considered bypassing this by creating an S3 file that we upload, but this feels like overkill for what should be an otherwise simple insert.
04-03-2018 02:49 PM
We’ve found a good workaround for this issue thanks to a recent MySQL Snap release that hasn’t even made it into the release notes yet.
The MySQL Bulk Load Snap lets you define the working character encoding, and with that set it successfully inserts UTF-8 encoded strings into a table. So, it looks like Insert and Merge could use the same encoding option that Bulk Load supports, if possible.
04-08-2018 12:22 AM
Did you try setting these properties on the MySQL account?
useUnicode=yes
characterEncoding=utf8
It works for our Aurora instance.
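For anyone connecting outside of the account settings, the same two properties can also go on the JDBC URL as query parameters. Below is a minimal Python sketch that just builds such a URL string. The host and database names are placeholders, and `mysql_jdbc_url` is a made-up helper, not part of any library.

```python
from urllib.parse import urlencode


def mysql_jdbc_url(host: str, database: str, port: int = 3306, **props: str) -> str:
    """Build a MySQL JDBC URL carrying the two encoding properties from this post."""
    # Property names and values are the ones given above.
    params = {"useUnicode": "yes", "characterEncoding": "utf8"}
    # Extra or overriding driver properties can be passed as keyword arguments.
    params.update(props)
    return f"jdbc:mysql://{host}:{port}/{database}?{urlencode(params)}"


if __name__ == "__main__":
    # Placeholder host and database; substitute your own Aurora endpoint.
    print(mysql_jdbc_url("example-cluster.example.com", "mydb"))
```

Whether your tool wants these on the URL or in a separate properties table depends on the tool, so treat the URL form as one option only.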