How do I load XML in PHP when the document does not declare its encoding?
I'm trying to load an XML source from a remote location, so I have no control over the formatting. Unfortunately, the XML file I'm trying to load does not declare an encoding:
<ROOT xmlns:sql="urn:schemas-microsoft-com:xml-sql"> <NODE> </NODE> </ROOT>
When I try the following:
$doc = new DOMDocument( );
$doc->load(URI);
I get:
Input is not proper UTF-8, indicate encoding ! Bytes: 0xA3 0x38 0x2C 0x38
I've looked at ways to suppress this, but with no luck. How should I load this so that I can use it with DOMDocument?
Recommended answer
You have to convert the document to UTF-8 before parsing it. The easiest way is utf8_encode(), which converts from ISO-8859-1 to UTF-8, so it only gives correct results if the source really is ISO-8859-1.
DOMDocument example:
$doc = new DOMDocument();
$content = utf8_encode(file_get_contents($url));
$doc->loadXML($content);
SimpleXML example:
$xmlInput = simplexml_load_string(utf8_encode(file_get_contents($url_or_file)));
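Note that utf8_encode() is deprecated as of PHP 8.2. A minimal sketch of the same conversion using mb_convert_encoding() follows. It assumes the mbstring and dom extensions are available and that the source is ISO-8859-1. The inline $raw string is a hypothetical stand-in for file_get_contents($url), built to contain the bytes from the error message.

```php
<?php
// Stand-in for file_get_contents($url): XML with no declaration that
// contains the single byte 0xA3 (the pound sign in ISO-8859-1).
$raw = "<ROOT><NODE>\xA38,8</NODE></ROOT>";

// Assumption: the source is ISO-8859-1. This performs the same conversion
// as utf8_encode(), but names the source encoding explicitly.
$content = mb_convert_encoding($raw, 'UTF-8', 'ISO-8859-1');

$doc = new DOMDocument();
$ok = $doc->loadXML($content);

// The node text is now valid UTF-8.
$node = $doc->getElementsByTagName('NODE')->item(0);
echo $node->textContent, "\n";
```

Naming the source encoding explicitly also makes it easy to swap in a different one if the feed turns out not to be ISO-8859-1.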
If you don't know the current encoding, use mb_detect_encoding(), for example:
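$content = file_get_contents($url_or_file);
// Detect on the raw bytes; the candidate list is an assumption, adjust as needed
$encoding = mb_detect_encoding($content, ['UTF-8', 'ISO-8859-1'], true);
$doc = new DOMDocument();
// Prepend a complete XML declaration (the source document has none)
$res = $doc->loadXML('<?xml version="1.0" encoding="' . $encoding . '"?>' . $content);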
Notes:
- If the encoding cannot be detected (mb_detect_encoding() returns false), you can fall back to forcing the conversion with utf8_encode(), which treats the input as ISO-8859-1.
- If you're loading HTML with $doc->loadHTML instead, you can still prepend an XML header to declare the encoding.
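The detect-then-fall-back idea from the first note could be wrapped up roughly like this. It is only a sketch: toUtf8() is a hypothetical helper name, the candidate list is an assumption you should adjust to the encodings you actually expect, and the inline XML string stands in for the remote document.

```php
<?php
/**
 * Hypothetical helper: convert a string of unknown encoding to UTF-8.
 * $candidates is an assumed list of encodings to try, in order.
 */
function toUtf8(string $content, array $candidates = ['UTF-8', 'ISO-8859-1']): string
{
    // Strict mode rejects any candidate for which the bytes are invalid.
    $encoding = mb_detect_encoding($content, $candidates, true);
    if ($encoding === false) {
        // Detection failed: treat the input as ISO-8859-1,
        // which is what utf8_encode() would assume.
        $encoding = 'ISO-8859-1';
    }
    return mb_convert_encoding($content, 'UTF-8', $encoding);
}

// Usage: the byte 0xA3 is invalid as UTF-8, so it is converted from ISO-8859-1.
$doc = new DOMDocument();
$loaded = $doc->loadXML(toUtf8("<ROOT><NODE>\xA38,8</NODE></ROOT>"));
echo $doc->documentElement->textContent, "\n";
```

Converting the bytes up front, rather than declaring the detected encoding in a prepended header, keeps the rest of the code working with UTF-8 only.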
If you know the encoding, use iconv() to convert it:
$xml = iconv('ISO-8859-1', 'UTF-8', $xmlInput);
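iconv() returns false when the conversion fails, so it is worth checking the result before parsing. A small sketch, assuming the feed is known to be ISO-8859-1; the inline $xmlInput string is a hypothetical stand-in for the downloaded document.

```php
<?php
// Stand-in for the downloaded document (assumed to be ISO-8859-1).
$xmlInput = "<ROOT><NODE>\xA38,8</NODE></ROOT>";

$xml = iconv('ISO-8859-1', 'UTF-8', $xmlInput);
if ($xml === false) {
    // iconv() signals a failed conversion by returning false.
    throw new RuntimeException('Could not convert the document to UTF-8');
}

// The converted string can now be parsed as usual.
$root = simplexml_load_string($xml);
echo (string) $root->NODE, "\n";
```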