UTF-8: displayed correctly in the database, but not in HTML, despite utf-8 charset

2022-01-07 00:00:00 utf-8 character-encoding mysql html

I use MySQL 5.1 and loaded about 2.7 million lines from a UTF-8 encoded text file into a table using LOAD DATA INFILE... The table itself is declared as utf8_unicode_ci, and so are all of its char fields.

In the database itself the characters all seem to be correct; everything looks fine. However, when I print them with PHP, the characters show up as ???, although I use a UTF-8 declaration in the HTML head:

<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
...

In another table (also using utf-8), into which I inserted text from a submitted form, the characters look strange in the database but are shown correctly again when I SELECT and print them.

So I was wondering: what is wrong? Should UTF-8 characters look correct in the database, or should they look strange there and come out fine when you SELECT them again? And where is the problem: in loading the file into the database, in the HTML, or somewhere in between?

Thank you very much for any hint or suggestion! :)

Recommended answer

Note: MySQL's utf8 charset is limited; it only supports Unicode characters in the BMP, which take up no more than three bytes in UTF-8. You should be using utf8mb4 instead.

  • Make sure you send the SET NAMES utf8mb4 command to MySQL after connecting, before running any other MySQL queries.
  • Make sure your page is actually rendered as utf-8 (if there's an HTTP header such as Content-Type: text/html;charset=iso-8859-1, it conflicts with the meta tag, and browsers disagree about which should win).
  • Read this article: Handling Unicode Front To Back In A Web App (but remember to replace utf8 with utf8mb4 where MySQL is concerned).
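To see why a mismatched connection charset would produce both symptoms from the question, here is a minimal Python sketch. It is only a simulation: it uses Python's latin-1 codec as a stand-in for a MySQL connection left at a latin1 default (an assumption about the asker's setup; MySQL's latin1 is closer to Windows-1252), and the function names and sample strings are hypothetical.

```python
# Simulation only: Python codecs stand in for MySQL's charset conversions.
# Assumption: the connection charset was left at a latin1 default
# because SET NAMES was never sent.


def read_over_latin1_connection(stored_text: str) -> str:
    """Correctly stored utf8 text, converted for a latin1 connection.

    Characters with no latin1 equivalent are replaced by '?'.
    """
    return stored_text.encode("latin-1", errors="replace").decode("latin-1")


def write_over_latin1_connection(client_utf8_bytes: bytes) -> str:
    """UTF-8 bytes from a form, misread as latin1 before being stored."""
    return client_utf8_bytes.decode("latin-1")


def read_back_as_page_bytes(stored_text: str) -> bytes:
    """The reverse conversion on SELECT, which restores the original bytes."""
    return stored_text.encode("latin-1")


# Case 1: data loaded correctly (looks fine in the database), read by a
# client on a latin1 connection.
print(read_over_latin1_connection("Ωμέγα"))  # ?????

# Case 2: form text written over a latin1 connection looks strange in the
# database, yet round-trips back to the right bytes on SELECT.
stored = write_over_latin1_connection("é".encode("utf-8"))
print(stored)  # Ã©
print(read_back_as_page_bytes(stored).decode("utf-8"))  # é
```

In this model the first table holds good data and only the read path is wrong, while the second table holds double-encoded data that merely looks right on the way out. Sending SET NAMES utf8mb4 addresses the first case; rows already written the second way would presumably still need repairing.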

If phpMyAdmin displays your entered data as correct Unicode text, then my bet is that you are not doing SET NAMES utf8 after connecting.
