UTF-8 encoded html page shows � (question marks) instead of characters

I have the standard XAMPP installation on win7 (x64). Having had my share of encoding troubles in a past project, where the mysql encoding did not match the php encoding, which in turn sometimes output html in yet other encodings, I decided to consistently encode everything using utf-8.

I'm just getting started with the html markup and am already experiencing troubles.

  • My page is saved using utf-8 (no BOM, I think)
    //update: It turns out this was NOT the case. The file was actually saved as ISO-8859-1. I later found this out thanks to Sherm Pendley's answer. I had to go back and change my project settings (which were set to "ISO-8859-1") to the desired "UTF-8".
  • php is set via .htaccess to serve .php pages in utf-8 with: AddCharset UTF-8 .php
  • html has a meta tag specifying: <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
  • To test, I also set the header in php: header('Content-Type:text/html; charset=UTF-8');

The page is evidently served in utf-8 (Firefox and Chrome recognize it as such), but any special characters such as é, á or ¡ just show as �. The same happens when viewing the source code.

When dropping the encoding settings mentioned above all characters are rendered correctly but the encoding that is detected shows either windows-1252 or ISO-8859-1 depending on the browser.

How come? I'm very puzzled. I would have expected the exact opposite behavior.
Any advice is welcome, thanks!

edit: Hopefully this helps a bit more. This is the response header (as per firebug)

HTTP/1.1 200 OK
Date: Sat, 26 Mar 2011 20:49:44 GMT
Server: Apache/2.2.14 (Win32) DAV/2 mod_ssl/2.2.14 OpenSSL/0.9.8l mod_autoindex_color PHP/5.3.1 mod_apreq2-20090110/2.7.1 mod_perl/2.0.4 Perl/v5.10.1
X-Powered-By: PHP/5.3.1
Content-Length: 91
Keep-Alive: timeout=5, max=99
Connection: Keep-Alive
Content-Type: text/html; charset=utf-8

Recommended answer

When [dropping] the encoding settings mentioned above all characters [are rendered] correctly but the encoding that is detected shows either windows-1252 or ISO-8859-1 depending on the browser.

Then that's what you're really sending. None of the encoding settings in your bullet list will actually modify your output in any way; all they do is tell the browser what encoding to assume when interpreting what you send. That's why you're getting those �s - you're telling the browser that what you're sending is UTF-8, but it's really ISO-8859-1.
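To see the mismatch in isolation, here is a minimal sketch. It uses Python purely for illustration (the thread itself is about PHP and Apache), and the helper name is_valid_utf8 is made up for this example. It encodes the three characters from the question as ISO-8859-1 and then decodes those bytes as UTF-8, which is roughly what a browser does when the header says charset=utf-8 but the file was saved as ISO-8859-1.

```python
# Illustration only: ISO-8859-1 bytes read by a consumer that was told "UTF-8".
# Escapes are used so this script does not depend on how it is saved:
# \u00e9 = e-acute, \u00e1 = a-acute, \u00a1 = inverted exclamation mark.
text = "\u00e9, \u00e1 or \u00a1"

latin1_bytes = text.encode("iso-8859-1")  # what the file actually contained
utf8_bytes = text.encode("utf-8")         # what the headers promised

# Decoding as UTF-8 with errors="replace" substitutes U+FFFD (the
# replacement character) for every byte sequence that is not valid UTF-8.
as_seen_by_browser = latin1_bytes.decode("utf-8", errors="replace")
print(ascii(as_seen_by_browser))


def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


print(is_valid_utf8(latin1_bytes))  # the mis-saved file
print(is_valid_utf8(utf8_bytes))    # the same text saved as real UTF-8
```

A check like is_valid_utf8 on the raw bytes of a source file is one way to confirm what the editor really wrote to disk. If it fails, the fix is the one described in the question's update: re-save the file as UTF-8 and leave the headers alone.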
