PHP: Replace umlauts with the closest 7-bit ASCII equivalent in a UTF-8 string

2021-12-28 00:00:00 utf-8 php diacritics strtr

What I want to do is remove all accents and umlauts from a string, turning "lärm" into "larm" or "andré" into "andre". I tried to utf8_decode the string and then use strtr on it, but since my source file is saved as a UTF-8 file, I can't enter the ISO-8859-15 characters for all the umlauts: the editor inserts the UTF-8 characters instead.

Obviously, one solution would be to put the mapping in an include that is saved as an ISO-8859-15 file, but surely there is a better way than adding yet another required include?

echo strtr(utf8_decode($input), 
           'ŠŒŽšœžŸ¥µÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýÿ',
           'SOZsozYYuAAAAAAACEEEEIIIIDNOOOOOOUUUUYsaaaaaaaceeeeiiiionoooooouuuuyy');

UPDATE: Maybe I was a bit inaccurate about what I am trying to do: I do not actually want to remove the umlauts, but to replace them with their closest "one character ASCII" equivalent.

Recommended answer

iconv("utf-8","ascii//TRANSLIT",$input);
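
A minimal sketch of how the call above might be wrapped in practice. The function name translit_to_ascii and the default locale en_US.UTF-8 are assumptions for illustration, not part of the original answer. The output of //TRANSLIT depends on the iconv implementation and the active LC_CTYPE locale (for example, "ä" may come out as "a", "ae" or "?"), so no expected output is shown here.

```php
<?php
// Hypothetical helper: transliterate a UTF-8 string to 7-bit ASCII with iconv.
// Assumption: the given locale (default en_US.UTF-8) is installed on the system.
function translit_to_ascii(string $input, string $locale = 'en_US.UTF-8'): ?string
{
    // //TRANSLIT consults the LC_CTYPE locale on many systems; note that
    // setlocale() changes the locale for the whole process.
    setlocale(LC_CTYPE, $locale);

    // iconv() returns false on failure (and may emit a notice, suppressed here).
    $result = @iconv('UTF-8', 'ASCII//TRANSLIT', $input);

    return $result === false ? null : $result;
}

$ascii = translit_to_ascii('lärm andré');
echo $ascii === null ? "Transliteration failed\n" : $ascii . "\n";
```

If the result is not what you expect, the locale is the first thing worth checking, since the same input can transliterate differently on different machines.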

Extended example
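
As a locale-independent alternative, the strtr approach from the question can be made to work on UTF-8 directly by passing an array of replacement pairs instead of two strings: the array form replaces whole byte sequences, so no utf8_decode step is needed. This is a sketch only. The function name replace_diacritics is hypothetical, the map covers just a small illustrative subset of characters, and it assumes the source file is saved as UTF-8 and the input uses precomposed characters rather than combining marks.

```php
<?php
// Hypothetical helper: map accented characters to a single ASCII character.
// The map below is an illustrative subset; extend it as needed.
function replace_diacritics(string $input): string
{
    static $map = [
        'ä' => 'a', 'ö' => 'o', 'ü' => 'u', 'ß' => 's',
        'Ä' => 'A', 'Ö' => 'O', 'Ü' => 'U',
        'á' => 'a', 'à' => 'a', 'â' => 'a',
        'é' => 'e', 'è' => 'e', 'ê' => 'e',
        'ç' => 'c', 'ñ' => 'n',
    ];

    // With an array argument, strtr() matches multi-byte keys as whole
    // substrings, so UTF-8 input is handled without re-encoding.
    return strtr($input, $map);
}

echo replace_diacritics('lärm'), "\n";  // larm
echo replace_diacritics('andré'), "\n"; // andre
```

Characters that are not in the map pass through unchanged, so the result is only guaranteed to be pure ASCII for input the map fully covers.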
