PHP multibyte str_replace?

2021-12-25 00:00:00 string replace php multibyte-functions

I'm trying to replace accented characters in PHP but get funky results. My guess is that this is because I'm using a UTF-8 string and str_replace can't properly handle multi-byte strings.

$accents_search     = array('á','à','â','ã','ª','ä','å','Á','À','Â','Ã','Ä','é','è',
'ê','ë','É','È','Ê','Ë','í','ì','î','ï','Í','Ì','Î','Ï','œ','ò','ó','ô','õ','º','ø',
'Ø','Ó','Ò','Ô','Õ','ú','ù','û','Ú','Ù','Û','ç','Ç','Ñ','ñ'); 

$accents_replace    = array('a','a','a','a','a','a','a','A','A','A','A','A','e','e',
'e','e','E','E','E','E','i','i','i','i','I','I','I','I','oe','o','o','o','o','o','o',
'O','O','O','O','O','u','u','u','U','U','U','c','C','N','n'); 

$str = str_replace($accents_search, $accents_replace, $str);

Result I get:

Ørjan Nilsen -> �orjan Nilsen

Expected result:

Ørjan Nilsen -> Orjan Nilsen

I've got my internal character encoding set to UTF-8 (according to mb_internal_encoding()), and the value of $str is also UTF-8, so as far as I can tell all the strings involved are UTF-8. Does str_replace() detect character sets and handle them properly?

Recommended answer

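It looks like the string was not replaced correctly because the encoding of your input and the encoding of the PHP source file do not match. str_replace() does not detect character sets; it compares raw bytes, so the literals in $accents_search only match when the file is saved in the same encoding as $str (here UTF-8).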
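One way to check this is to take the source file's encoding out of the picture. The sketch below is not the asker's code: replace_accents() is a hypothetical helper, and its short map is only an illustrative subset of the original arrays. The needles are written as explicit UTF-8 byte escapes, so they are UTF-8 regardless of how the file is saved. With UTF-8 needles and a UTF-8 subject, plain str_replace() should give the expected result.

```php
<?php
// Hypothetical helper: the needles are spelled out as explicit UTF-8 byte
// sequences, so they stay UTF-8 no matter how this file is saved.
function replace_accents(string $str): string
{
    $map = [
        "\xC3\x98" => 'O',   // U+00D8, capital O with stroke
        "\xC3\xB8" => 'o',   // U+00F8, small o with stroke
        "\xC5\x93" => 'oe',  // U+0153, oe ligature
        "\xC3\xA9" => 'e',   // U+00E9, e with acute
    ];

    // str_replace() compares bytes; both sides are UTF-8 here, so it matches.
    return str_replace(array_keys($map), array_values($map), $str);
}

$str = "\xC3\x98rjan Nilsen";            // "Ørjan Nilsen" as UTF-8 bytes

// An empty pattern with the /u modifier matches only if the subject is valid UTF-8.
var_dump(preg_match('//u', $str) === 1); // bool(true)

echo bin2hex("\xC3\x98"), "\n";          // c398
echo replace_accents($str), "\n";        // Orjan Nilsen
```

To diagnose your own file, run bin2hex() on one of the literals from $accents_search, for example bin2hex('Ø'). If it does not print c398, the file is probably not saved as UTF-8, and re-saving it as UTF-8 should make the original str_replace() call behave as expected.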
