Using GD (imagettftext()) with UTF-8 characters

2021-12-28 00:00:00 image special-characters utf-8 php gd

Just for the record: this is my first question here, but hopefully not my last contribution to the community. But that's not why I'm here.

I'm currently developing a simple system that has to generate an image with text on it. Everything went well until I realised that GD cannot handle UTF-8 characters like

ā, č, ž, ä, ø, é

and so on.

To be clear: I'm using imagettftext().

Trying to solve my problem, I dug deep into Google and found some solutions, but sadly none of them solved it completely. Currently I'm using this script, which I found in this thread: "PHP function imagettftext() and unicode"

private function properText($text){

    // Convert UTF-8 string to HTML entities
    $text = mb_convert_encoding($text, 'HTML-ENTITIES',"UTF-8");
    // Convert HTML entities into ISO-8859-1
    $text = html_entity_decode($text,ENT_NOQUOTES, "ISO-8859-1");
    // Convert characters > 127 into decimal numeric character references (&#NNN;)
    $out = "";
    for($i = 0; $i < strlen($text); $i++) {
        $letter = $text[$i];
        $num = ord($letter);
        if($num>127) {
          $out .= "&#$num;";
        } else {
          $out .=  $letter;
        }
    }

    return $out;

}

It works fine for some characters but not all of them; for example, "a" with an umlaut (ä) isn't converted correctly.

So at this point I'm not sure where to look or what to look for anymore, as I cannot predict the user input. To be more precise, the system pulls artist names from an XML feed and uses that data to generate the images (I'm not planning to support hieroglyphs).

I've made sure that the data gathered from the feed is indeed UTF-8 by using PHP's mb_detect_encoding(), and I've made sure that all the characters that currently aren't displayed correctly are indeed in the font file I'm feeding to the imagettftext() function by checking it with the Windows charmap tool.

Hopefully I can find my answer here. Thank you in advance for your help!

Edit

To clarify: the characters are not displayed correctly or, to be more precise, are replaced by malformed characters. Here is a screenshot -

It should read "José González".

Edit No. 2

Using the bin2hex() function on the data retrieved from the XML feed returns this:

José González -> 4a6f73c3a920476f6e7ac3a16c657a
// input -> bin2hex(input)
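
For readers who want to double-check that dump, here is a minimal sketch that decodes the hex string quoted above and validates it as UTF-8. It assumes the mbstring extension is loaded; the variable names are illustrative and not from the original post. It uses mb_check_encoding() because that function validates the bytes against one named encoding, whereas mb_detect_encoding() only guesses among candidates.

```php
<?php
// Hex dump quoted in the question (output of bin2hex() on the feed data).
$hex = '4a6f73c3a920476f6e7ac3a16c657a';
$raw = hex2bin($hex);

// Strict validation: true only if the bytes form well-formed UTF-8.
$isUtf8 = mb_check_encoding($raw, 'UTF-8');

// Byte count versus character count: the two accented letters
// (c3 a9 = é, c3 a1 = á) each occupy two bytes in UTF-8.
$byteLength = strlen($raw);
$charLength = mb_strlen($raw, 'UTF-8');

printf("valid UTF-8: %s, bytes: %d, characters: %d\n",
    $isUtf8 ? 'yes' : 'no', $byteLength, $charLength);
```

The two-byte sequences c3 a9 and c3 a1 are the UTF-8 encodings of é and á, so the feed data itself looks like clean UTF-8 and the corruption would have to happen later, on the way into the image.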

Edit - fixed

As I continued my research I found an answer to my problem. This piece of code did it!

$text = mb_convert_encoding($text, "HTML-ENTITIES", "UTF-8");
$text = preg_replace('~^(&([a-zA-Z0-9]);)~',htmlentities('${1}'),$text);
return($text); 

Now all the characters that troubled me are displayed correctly!

Recommended answer

As I continued my research I found an answer to my problem. This piece of code did it!

private function properText($text){
    $text = mb_convert_encoding($text, "HTML-ENTITIES", "UTF-8");
    $text = preg_replace('~^(&([a-zA-Z0-9]);)~',htmlentities('${1}'),$text);
    return($text); 
}

Now all the characters that troubled me (and all the new ones I've seen since) are displayed correctly!
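
A note for anyone reusing this on a newer PHP version: as far as I can tell, the mb_convert_encoding() call is what does the work here. The preg_replace() pattern is anchored at the start of the string and only matches a one-character entity name, and its replacement is the matched text itself, so it appears to change nothing. The 'HTML-ENTITIES' pseudo-encoding is also deprecated in mbstring as of PHP 8.2. Below is a minimal sketch of an alternative, assuming mbstring is available; toNumericEntities() is a hypothetical helper name, not part of the original answer.

```php
<?php
// Hypothetical helper (not from the original post): turns every non-ASCII
// character of a UTF-8 string into a decimal numeric character reference.
function toNumericEntities(string $text): string
{
    // Map format: [first code point, last code point, offset, mask].
    // This map covers U+0080..U+10FFFF and leaves plain ASCII untouched.
    $map = [0x80, 0x10FFFF, 0, 0x1FFFFF];

    return mb_encode_numericentity($text, $map, 'UTF-8');
}

// Same bytes as the bin2hex() dump in the question ("José González").
$name = hex2bin('4a6f73c3a920476f6e7ac3a16c657a');
$encoded = toNumericEntities($name);

echo $encoded, "\n"; // Jos&#233; Gonz&#225;lez
```

The resulting string could then be passed as the text argument to imagettftext(). The PHP manual describes that argument as a UTF-8 string that may include decimal numeric character references, so passing the raw UTF-8 string directly may also work, provided the font file contains the glyphs.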
