Something switches MySQL database table charset under my feet.
I added a few custom tables to an existing database (created by WordPress), declaring their default charset as utf8. Then I uploaded a bunch of utf8mb3 strings (mb3 is the default MySQL UTF-8 variant) and verified the table content by dumping it.
To my surprise, about a week later I found that the table data encoding had changed from utf8mb3 to mb4. Of course, I reloaded the data and verified the encoding again. A few days later the encoding switched to mb4 again, by itself!
The issue is not in the data: the uploaded text seems to be correctly encoded in mb4. It is the randomly changing encoding that bothers me, partly because I could not find a way to make the MySQL command-line admin tool (and some old text editors) understand utf8mb4.
Does anyone have ideas of what is going on and how to work around it?
Thank you in advance
First, I'm no database admin, but I've been looking at some dev resources (https://www.joelonsoftware.com/2003/10/08/the-absolute-minimum-every-software-developer-absolutely-p... and https://medium.com/@adamhooper/in-mysql-never-use-utf8-use-utf8mb4-11761243e434). It seems what you think is encoded as utf8 is actually utf8mb4, as that's the true default for UTF-8, since MySQL's actual utf8 is restricted to three bytes per character.
If I understand correctly, utf8mb4 is a superset of utf8mb3 and would be what is used by default in MySQL versions 5.5 and newer. At least it looks that way from the MySQL developer documentation referenced within one of our test accounts (https://dev.mysql.com/doc/refman/5.5/en/charset-charsets.html). I also know that certain characters, such as emojis, need mb4. If that type of data is being used, the database could be self-updating to support it.
Here is the reference to the corresponding MySQL manual page (my server version is 5.6.39):
which says that utf8 is equivalent to utf8mb3.
For my tests I am using UTF-8 characters that have a 16-bit representation and are therefore correctly handled within the mb3 subset (I confirmed this by dumping the tables). I did not try to specify utf8mb3 explicitly, though.
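For anyone who wants to double-check the same thing, here is a minimal sketch in Python. It only looks at the strings themselves, not at the database: a character whose code point is at or below U+FFFF takes at most three bytes in UTF-8, so it fits in mb3, while anything above that (most emojis, for example) needs four bytes. The helper names `utf8_byte_lengths` and `fits_utf8mb3` are made up for illustration.

```python
def utf8_byte_lengths(text):
    """Map each distinct character to the byte length of its UTF-8 encoding."""
    return {ch: len(ch.encode("utf-8")) for ch in text}


def fits_utf8mb3(text):
    """Return True if every character has a code point <= U+FFFF.

    Such characters need at most 3 bytes in UTF-8, which is the limit
    of MySQL's utf8 (utf8mb3) charset.
    """
    return all(ord(ch) <= 0xFFFF for ch in text)


if __name__ == "__main__":
    # Illustrative samples only; the last one contains an emoji (U+1F600).
    samples = ["Zurich", "Z\u00fcrich", "\u20ac 10", "smile \U0001F600"]
    for sample in samples:
        longest = max(utf8_byte_lengths(sample).values())
        print(repr(sample), "fits mb3:", fits_utf8mb3(sample),
              "longest char (bytes):", longest)
```

If every string passes this check, the data itself gives MySQL no reason to need mb4.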
Why a database would self-update, as you suggested, is a big mystery and a concern.
I am inclined to think that it is GoDaddy moving my database from one virtual host to another by dumping and reloading the table data. At some point in this process my charset gets clobbered and some other defaults override my settings.
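One way to test this theory would be to keep a dump from right after loading the data and compare its table charset declarations with a later dump. Below is a rough sketch in Python, assuming the dumps are in the usual mysqldump text format, where each CREATE TABLE statement ends with a `DEFAULT CHARSET=...` table option. The function names and the table name `my_custom` are hypothetical, and the regular expressions are a simplification rather than a real SQL parser.

```python
import re

# Table name in a CREATE TABLE line, and the DEFAULT CHARSET table option
# that mysqldump normally writes on the closing line of the statement.
CREATE_RE = re.compile(r"CREATE TABLE\s+`([^`]+)`", re.IGNORECASE)
CHARSET_RE = re.compile(r"DEFAULT CHARSET=(\w+)", re.IGNORECASE)


def table_charsets(dump_text):
    """Return {table_name: default_charset} found in mysqldump-style text."""
    result = {}
    current = None  # table whose CREATE TABLE statement we are inside
    for line in dump_text.splitlines():
        match = CREATE_RE.search(line)
        if match:
            current = match.group(1)
            continue
        match = CHARSET_RE.search(line)
        if match and current is not None:
            result[current] = match.group(1).lower()
            current = None
    return result


def changed_charsets(old_dump, new_dump):
    """Return {table: (old_charset, new_charset)} for tables that differ."""
    old, new = table_charsets(old_dump), table_charsets(new_dump)
    return {t: (old[t], new[t]) for t in old if t in new and old[t] != new[t]}


if __name__ == "__main__":
    # Hypothetical dump fragments, shortened for illustration.
    before = "CREATE TABLE `my_custom` (\n  `id` int(11) NOT NULL\n) ENGINE=InnoDB DEFAULT CHARSET=utf8;\n"
    after = "CREATE TABLE `my_custom` (\n  `id` int(11) NOT NULL\n) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;\n"
    print(changed_charsets(before, after))
```

If the declared charset changes between two dumps without any action on my side, that would point at something on the hosting side rather than at the data.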