MySQL Insert Can't Insert Special Characters

Is there any way to encode strings or perform a MySQL insert on a special character like ł or Ł. I can’t get it working, but it works just fine using another ETL process.

In my specific use case I’m trying to insert names that include special characters. They appear in MySQL as “?” instead of the special character.

Any help is appreciated!

Does that other ETL process work with the same database(what Oracle users might call a schema)?

https://dev.mysql.com/doc/refman/5.7/en/charset-applications.html

The problem here is that unicode has a few problems, so it can’t be supported 100%, and everything and everyone support it a bit differently. The default is often ASCII, and if you want to support unicode:

  1. EVERYTHING must support it all the way… The input, processing, and output, as well as source and target.
  2. In some cases, like older Oracle installations, the operating system environment containing such instances must be setup right.
  3. In some cases, like some Microsoft installations, you must use the exact code page sometimes.
  4. With GRAPHIC characters, especially if they are the early microsoft/IBM ones, all bets might be off, and you may have to settle.

One good thing though, these are unicode, so it SHOULD be doable:

From wikipedia…
“The Unicode codepoints for the letter are U+0142 for the lower case, and U+0141 for the capital.[11] In the LaTeX typesetting system Ł and ł may be typeset with the commands \L{} and \l{}, respectively. The HTML-codes are Ł and ł for Ł and ł, respectively.”

Your first task should be to run the process with a few of the offending records in preview mode. Check the target(just before the converter in the preview window). If it is right, the problem is with the target driver, or some target centric area. Check the source. If it is bad, the problem is with the source driver, or some source centric area. THEY are usually the culprit. Then maybe try key points in between to find one that fails, and work back from there.

OH, and if you are using an oracle version prior to 12(I forget the EXACT version in 12 where they fixed it), and you find the problem is THERE, it might be an ENVIRONMENT problem. You will have to fix the environment(it is just a couple environment variable settings)

https://docs.oracle.com/cd/E12102_01/books/AnyInstAdm784/AnyInstAdmPreInstall18.html

and restart the drivers and/or snaplex.

We’re using Aurora, so we’re communicating using MySQL libs on UTF-8 defined InnoDB tables.

Under the hood, we’ve drilled down into how to test this:

  1. JSON generator does not output correctly if you copy paste the value
  2. Inserting the value manually into the UTF-8 table and running a select against it successfully loads the character into Snaplogic (it is properly previewable)
  3. Selecting that value and then inserting it back into the table fails, as the value inputted is just a ? instead of the proper character

So, under the hood, it seems like there’s a problem with either the insert or merge steps, or both.

We’ve considered bypassing this by creating an S3 file that we upload, but this feels like overkill for what should be an otherwise simple insert.

We’ve found a good workaround for this issue thanks to a recent MySQL Snap release that hasn’t even made it into the release notes yet.

The MySQL Bulk load snap allows you to define the working character encoding set, which successfully inserts UTF8 encoded strings into a table. So, it looks like Insert and Merge could use the same encoding flag change that Bulk Load supports, if possible.

Did you try to set properties for the mysql account with
useUnicode=yes
characterEncoding=utf8
It works for our Aurora instance.

1 Like

THANK YOU jaybodra!!! Turns out our workaround broke in MySQL 5.7, and we stumbled upon our own question which now has your solution that worked. A bunch of employees with special characters in their names will be very grateful!