Detecting file encoding in PHP

2021-12-28 00:00:00 utf-8 character-encoding php

I have a script which combines a number of files into one, and it breaks when one of the files has UTF8 encoding. I figure that I should be using the utf8_decode() function when reading the files, but I don't know how to tell which need decoding.

My code is basically:

$output = '';
foreach ($files as $filename) {
    $output .= file_get_contents($filename) . "\n";
}
file_put_contents('combined.txt', $output);

Currently, at the start of a UTF8 file, it adds these characters in the output: 

Recommended answer

Try using the mb_detect_encoding function. This function will examine your string and attempt to "guess" what its encoding is. You can then convert it as desired. As brulak suggested, however, you're probably better off converting to UTF-8 rather than from, to preserve the data you're transmitting.
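A minimal sketch of that approach, assuming the mbstring extension is available. The helper name read_as_utf8 and the candidate list (UTF-8, then ISO-8859-1) are illustrative assumptions, not part of the original question; adjust the candidates to whatever encodings your files can actually have. It also assumes the stray characters at the start of a UTF-8 file are a byte order mark, which is a common cause but is not confirmed by the question, so the sketch strips one if it is present.

```php
<?php
// Hypothetical helper (not from the original post): read a file and return
// its contents as UTF-8, without a leading byte order mark.
function read_as_utf8(string $filename, array $candidates = ['UTF-8', 'ISO-8859-1']): string
{
    $data = file_get_contents($filename);
    if ($data === false) {
        throw new RuntimeException("Cannot read $filename");
    }

    // Assumption: a UTF-8 BOM (bytes EF BB BF) may be present; drop it.
    if (strncmp($data, "\xEF\xBB\xBF", 3) === 0) {
        $data = substr($data, 3);
    }

    // Strict mode rejects a candidate if the bytes are invalid in it.
    // Detection is still a guess among the listed candidates.
    $encoding = mb_detect_encoding($data, $candidates, true);
    if ($encoding === false) {
        throw new RuntimeException("Cannot detect encoding of $filename");
    }

    // Convert to UTF-8 rather than from it, so no characters are lost.
    if ($encoding !== 'UTF-8') {
        $data = mb_convert_encoding($data, 'UTF-8', $encoding);
    }

    return $data;
}
```

In the loop from the question, calling read_as_utf8($filename) in place of file_get_contents($filename) would leave combined.txt consistently UTF-8 encoded.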
