How do I convert HTML entities such as &ndash; to their character equivalents?
I am creating a file that will be saved on a local user's computer (not rendered in a web browser).
I am currently using html_entity_decode, but it isn't converting entities like &ndash; (the en dash), and I am wondering which other function I should be using.
For example, when the file is imported into the software, the text shows up as the literal &ndash; instead of the en dash or just a plain -. I know I could use str_replace, but if it's happening with this character, it could happen with many others, since the data is dynamic.
Recommended answer
You need to define the target character set. The character that &ndash; stands for does not exist in ISO-8859-1 (the default charset of html_entity_decode before PHP 5.4), so the entity is left undecoded. Define UTF-8 as the output charset and it will decode:
echo html_entity_decode('&ndash;', ENT_NOQUOTES, 'UTF-8');
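Since the data is dynamic, the same call can be wrapped once and applied to every value before it is written to the file. The following is a minimal sketch, not a definitive implementation: the function name decode_entities_utf8 and the sample input are hypothetical, and the choice of ENT_QUOTES | ENT_HTML5 (available from PHP 5.4) is an assumption so that quote entities and the HTML5 named entities are decoded as well.

```php
<?php
// Hypothetical helper: decode all HTML entities in $text to UTF-8 characters.
function decode_entities_utf8($text)
{
    // ENT_QUOTES also decodes &quot; and &#039;.
    // ENT_HTML5 selects the HTML5 entity table.
    // 'UTF-8' is the target charset, so characters outside ISO-8859-1 can be produced.
    return html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');
}

// Hypothetical sample input; real data would come from the application.
$encoded = 'Pages 10&ndash;20 &amp; caf&eacute;';
$decoded = decode_entities_utf8($encoded);
```

The resulting string is UTF-8 encoded, so the software that imports the file has to read it as UTF-8 for the en dash to display correctly.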
If at all possible, you should avoid HTML entities to begin with. I don't know where that encoded data comes from, but if you're storing it like this in the database or elsewhere, you're doing it wrong. Always store data UTF-8 encoded, and only convert to HTML entities or otherwise escape it for output when necessary.
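As a rough illustration of that advice, the stored value stays raw UTF-8 and escaping happens only at the point where HTML is produced. This is a sketch under stated assumptions: the function name escape_for_html and the sample value are hypothetical, and htmlspecialchars is used here as one common way to escape for HTML output.

```php
<?php
// Hypothetical stored value: raw UTF-8 text containing an en dash (bytes E2 80 93).
$stored = "Pages 10\xE2\x80\x9320 <draft>";

// Hypothetical helper: escape only at the moment the text goes into HTML.
function escape_for_html($text)
{
    // Converts &, <, > and both quote characters; other UTF-8 text is left as-is.
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

// For a local file, the stored text is written unchanged.
$forFile = $stored;

// For a web page, the same text is escaped at output time.
$forHtml = escape_for_html($stored);
```

With this arrangement the file export needs no decoding step at all, because the data never contained entities in the first place.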