How to make MySQL handle UTF-8 properly

2021-11-20 00:00:00 utf-8 mysql

One of the responses to a question I asked yesterday suggested that I should make sure my database can handle UTF-8 characters correctly. How can I do this with MySQL?

Recommended answer

Update:

Short answer: you should almost always use the utf8mb4 charset and the utf8mb4_unicode_ci collation.
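The reason utf8mb4 matters is that MySQL's legacy utf8 charset (an alias for utf8mb3) stores at most three bytes per character, while some UTF-8 characters, such as emoji, need four. The following is a minimal Python sketch of that byte-width check; the helper names `utf8_width` and `needs_utf8mb4` are made up for illustration and are not part of MySQL or any library.

```python
# MySQL's legacy "utf8" charset (utf8mb3) stores at most 3 bytes per character.
LEGACY_UTF8_MAX_BYTES = 3


def utf8_width(ch: str) -> int:
    """Return how many bytes a single character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


def needs_utf8mb4(text: str) -> bool:
    """True if any character in text needs more than 3 bytes in UTF-8."""
    return any(utf8_width(ch) > LEGACY_UTF8_MAX_BYTES for ch in text)


# Latin letter, accented letter, CJK character, emoji
samples = ["A", "\u00e9", "\u4e2d", "\U0001F600"]
for ch in samples:
    print(f"U+{ord(ch):04X}: {utf8_width(ch)} byte(s), "
          f"needs utf8mb4: {needs_utf8mb4(ch)}")
```

Only the last sample is four bytes wide, so it is the only one that would not fit in a legacy utf8 column.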

To alter the database:

ALTER DATABASE dbname CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
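ALTER DATABASE changes only the database default, which applies to tables created afterwards; existing tables keep their current character set until they are converted individually. As a rough sketch, the per-table statements could be generated like this. The function names and the table names `users` and `posts` are hypothetical, and the script only builds SQL text; it does not connect to a server.

```python
def quote_identifier(name: str) -> str:
    """Quote a MySQL identifier with backticks, doubling embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def build_conversion_statements(database, tables,
                                charset="utf8mb4",
                                collation="utf8mb4_unicode_ci"):
    """Build ALTER statements for the database default and each listed table."""
    statements = [
        f"ALTER DATABASE {quote_identifier(database)} "
        f"CHARACTER SET {charset} COLLATE {collation};"
    ]
    for table in tables:
        # CONVERT TO rewrites the table's existing text columns as well
        statements.append(
            f"ALTER TABLE {quote_identifier(table)} "
            f"CONVERT TO CHARACTER SET {charset} COLLATE {collation};"
        )
    return statements


# Hypothetical database and table names, for illustration only
for statement in build_conversion_statements("dbname", ["users", "posts"]):
    print(statement)
```

Because CONVERT TO rewrites stored data, review the generated statements and take a backup before running them.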

See:

  • Aaron's comment on this answer: How to make MySQL handle UTF-8 properly
  • What's the difference between utf8_general_ci and utf8_unicode_ci?
  • Conversion guide: https://dev.mysql.com/doc/refman/5.5/en/charset-unicode-conversion.html

Original answer:

MySQL 4.1 and above supports UTF-8, but it is not the default everywhere: the server default was latin1 until MySQL 8.0, which switched to utf8mb4. You can check what your installation uses in your my.cnf file; remember to set both client and server (default-character-set and character-set-server).
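One way to check both settings is to read them out of the option file. The sketch below uses Python's standard configparser on an inline sample; the sample contents and the helper name `read_charset_settings` are assumptions for illustration. A real my.cnf may contain `!include` directives or other constructs that configparser does not understand, so treat this as a quick check rather than a full parser.

```python
import configparser

# Hypothetical my.cnf contents, inlined so the example is self-contained
SAMPLE_MY_CNF = """
[client]
default-character-set = utf8mb4

[mysqld]
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci
skip-name-resolve
"""


def read_charset_settings(cnf_text: str) -> dict:
    """Return the client and server character sets found in cnf_text."""
    parser = configparser.ConfigParser(
        allow_no_value=True,   # option files allow bare flags without a value
        strict=False,          # tolerate repeated sections or options
        interpolation=None,    # keep '%' characters literal
    )
    parser.read_string(cnf_text)
    return {
        "client": parser.get("client", "default-character-set", fallback=None),
        "server": parser.get("mysqld", "character-set-server", fallback=None),
    }


settings = read_charset_settings(SAMPLE_MY_CNF)
print(settings)
if settings["client"] != settings["server"]:
    print("Client and server character sets differ or are not set")
```

A value of None for either key means the option file does not set it, in which case the server's built-in default applies.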

If you have existing data that you wish to convert to UTF-8, dump your database and import it back as UTF-8, making sure that you:

  • Use SET NAMES utf8 before you query or insert into the database.
  • Use DEFAULT CHARSET=utf8 when creating new tables.
  • At this point your MySQL client and server should be in UTF-8 (see my.cnf). Remember that any language you use (such as PHP) must be UTF-8 as well. Some versions of PHP use their own MySQL client library, which may not be UTF-8 aware.

If you do want to migrate existing data, remember to back up first! Lots of weird chopping of data can happen when things don't go as planned!

Some resources:

  • Complete UTF-8 migration (cdbaby.com)
  • Article on UTF-8 readiness of PHP functions (note that some of this information is outdated)
