"Incorrect string value" when trying to insert UTF-8 into MySQL via JDBC?

2021-11-20 00:00:00 utf-8 mysql utf8mb4 jdbc

This is how my connection is set up:
Connection conn = DriverManager.getConnection(url + dbName + "?useUnicode=true&characterEncoding=utf-8", userName, password);

And I'm getting the following error when trying to add a row to a table:
Incorrect string value: '\xF0\x90\x8D\x83\xF0\x90...' for column 'content' at row 1

I'm inserting thousands of records, and I always get this error when the text contains \xF0 (i.e. the incorrect string value always starts with \xF0).

The column's collation is utf8_general_ci.

What could the problem be?

Recommended answer

MySQL's utf8 permits only the Unicode characters that can be represented with 3 bytes in UTF-8. Here you have a character that needs 4 bytes: \xF0\x90\x8D\x83 (U+10343 GOTHIC LETTER SAUIL).
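To see why this character falls outside MySQL's utf8, it can help to look at its UTF-8 bytes directly. The following is a minimal sketch; the class name Utf8ByteLength and the helpers utf8Hex and needsUtf8mb4 are made up for illustration and are not part of JDBC or Connector/J.

```java
import java.nio.charset.StandardCharsets;

class Utf8ByteLength {
    // Renders the UTF-8 bytes of the text in the same \xNN form MySQL uses in its error message.
    static String utf8Hex(String text) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
            sb.append(String.format("\\x%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // True if the text contains any code point above U+FFFF.
    // Such code points take 4 bytes in UTF-8 and do not fit in MySQL's 3-byte utf8.
    static boolean needsUtf8mb4(String text) {
        return text.codePoints().anyMatch(Character::isSupplementaryCodePoint);
    }

    public static void main(String[] args) {
        // U+10343 GOTHIC LETTER SAUIL, the character from the error message
        String sauil = new String(Character.toChars(0x10343));
        System.out.println(utf8Hex(sauil));           // prints \xF0\x90\x8D\x83
        System.out.println(needsUtf8mb4(sauil));      // prints true
        System.out.println(needsUtf8mb4("caf\u00e9")); // prints false (2-byte character)
    }
}
```

A check like needsUtf8mb4 could be run over the records before inserting them to find which rows would trigger the error.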

If you have MySQL 5.5 or later you can change the column encoding from utf8 to utf8mb4. This encoding allows storage of characters that occupy 4 bytes in UTF-8.
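As a rough sketch of that change issued over JDBC: the question does not give the table name or the column's type, so this example converts a whole table with ALTER TABLE ... CONVERT TO CHARACTER SET, which affects every text column in it rather than only 'content'. The table name posts, the class Utf8mb4Migration, and the choice of utf8mb4_general_ci (picked to mirror the existing utf8_general_ci) are assumptions.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

class Utf8mb4Migration {
    // Builds the DDL that converts all character columns of a table to utf8mb4.
    // The table name is restricted to simple identifiers because it cannot be a bind parameter.
    static String convertTableSql(String table) {
        if (!table.matches("[A-Za-z0-9_]+")) {
            throw new IllegalArgumentException("Unexpected table name: " + table);
        }
        return "ALTER TABLE `" + table + "` CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci";
    }

    // Runs the conversion on an open connection (requires MySQL 5.5 or later).
    static void convertTable(Connection conn, String table) throws SQLException {
        try (Statement stmt = conn.createStatement()) {
            stmt.executeUpdate(convertTableSql(table));
        }
    }

    public static void main(String[] args) {
        // "posts" is a hypothetical table name; only the generated SQL is printed here.
        System.out.println(convertTableSql("posts"));
    }
}
```

Converting a large table rewrites its data, so it is worth trying on a copy first.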

You may also have to set the server property character_set_server to utf8mb4 in the MySQL configuration file. It seems that Connector/J defaults to 3-byte Unicode otherwise:

For example, to use 4-byte UTF-8 character sets with Connector/J, configure the MySQL server with character_set_server=utf8mb4, and leave characterEncoding out of the Connector/J connection string. Connector/J will then autodetect the UTF-8 setting.
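Applied to the connection from the question, that advice might look like the sketch below: the URL carries no characterEncoding parameter, and a small helper prints the character set variables the session ended up with so the result can be checked. The host, port, database name, and the class and method names are placeholders. Whether autodetection behaves as quoted may depend on the Connector/J version, so checking the variables is worthwhile.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

class Utf8mb4Connection {
    // Builds a JDBC URL without characterEncoding, so the driver can pick up
    // the server's character_set_server setting as described in the quote above.
    static String buildUrl(String host, int port, String dbName) {
        return "jdbc:mysql://" + host + ":" + port + "/" + dbName;
    }

    // Prints the session's character set variables, e.g. character_set_client
    // and character_set_connection, to confirm that utf8mb4 is in effect.
    static void printCharsetVariables(Connection conn) throws SQLException {
        try (Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SHOW VARIABLES LIKE 'character_set%'")) {
            while (rs.next()) {
                System.out.println(rs.getString(1) + " = " + rs.getString(2));
            }
        }
    }

    public static void main(String[] args) {
        // Placeholder host, port and database name; no connection is opened here.
        System.out.println(buildUrl("localhost", 3306, "mydb"));
    }
}
```

With a live server, the URL would be passed to DriverManager.getConnection(url, userName, password) as in the question, followed by a call to printCharsetVariables(conn).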
