What does LIBXML_NOENT do (and why isn't it called LIBXML_ENT)?
In PHP, one can pass optional arguments to various XML parsers, one of them being LIBXML_NOENT. The documentation has this to say about it:
LIBXML_NOENT (integer)
Substitute entities
"Substitute entities" isn't very informative (which entities? when are they substituted?). But I think it's fair to assume that NOENT is short for NO_ENTITIES or NO_EXTERNAL_ENTITIES, so it seemed reasonable to me that this flag disables the parsing of (external) entities.
But that's not the case:
$xml = '<!DOCTYPE root [<!ENTITY c PUBLIC "bar" "/etc/passwd">]>
<test>&c;</test>';
$dom = new DOMDocument();
$dom->loadXML($xml, LIBXML_NOENT);
echo $dom->textContent;
The result is that the content of /etc/passwd is echoed. Without the LIBXML_NOENT argument this is not the case.
For non-external entities, the flag doesn't seem to have any effect. Example:
$xml = '<!DOCTYPE root [<!ENTITY c "TEST">]>
<test>&c;</test>';
$dom = new DOMDocument();
$dom->loadXML($xml);
echo $dom->textContent;
The result of this code is "TEST", with and without LIBXML_NOENT.
The flag doesn't seem to have any effect on pre-defined entities such as &lt; either.
So my questions are:
- What exactly does the LIBXML_NOENT flag do?
- Why is it called LIBXML_NOENT? What is it short for, and wouldn't LIBXML_ENT or LIBXML_PARSE_EXTERNAL_ENTITIES be a better fit?
- Is there a flag that actually prevents the parsing of all entities?
Recommended answer
Q: What exactly does the LIBXML_NOENT flag do?
The flag enables the substitution of XML entity references, external or not.
Q: Why is it called LIBXML_NOENT? What is it short for, and wouldn't LIBXML_ENT or LIBXML_PARSE_EXTERNAL_ENTITIES be a better fit?
The name is indeed misleading. I think that NOENT simply means that the node tree of the parsed document won't contain any entity reference nodes, because the parser substitutes the entities. Without NOENT, the parser creates DOMEntityReference nodes for entity references.
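To make the difference in the node tree visible, here is a minimal sketch that inspects the first child of the root element with and without the flag. The helper name firstChildClass is made up for this illustration, and the sample uses an internal entity only, so nothing is read from disk.

```php
<?php
// Internal entity only, so no external file is involved.
$xml = '<!DOCTYPE test [<!ENTITY c "TEST">]>
<test>&c;</test>';

// Hypothetical helper: returns the class name of the first child
// node of the root element after parsing with the given options.
function firstChildClass(string $xml, int $options): string
{
    $dom = new DOMDocument();
    $dom->loadXML($xml, $options);
    return get_class($dom->documentElement->firstChild);
}

echo firstChildClass($xml, 0), "\n";            // parsed without the flag
echo firstChildClass($xml, LIBXML_NOENT), "\n"; // parsed with the flag
```

On a stock PHP build with the DOM extension, this should print DOMEntityReference for the first call and DOMText for the second, which matches the reading of NOENT as "no entity reference nodes left in the tree".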
Q: Is there a flag that actually prevents the parsing of all entities?
LIBXML_NOENT enables the substitution of all entity references. If you don't want entities to be expanded, simply omit the flag. For example
$xml = '<!DOCTYPE test [<!ENTITY c "TEST">]>
<test>&c;</test>';
$dom = new DOMDocument();
$dom->loadXML($xml);
echo $dom->saveXML();
prints
<?xml version="1.0"?>
<!DOCTYPE test [
<!ENTITY c "TEST">
]>
<test>&c;</test>
It seems that textContent replaces entities on its own, which might be a peculiarity of the PHP bindings. Without LIBXML_NOENT, this leads to different behavior for internal and external entities, because the latter won't be loaded.
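The contrast between serialization and textContent can be checked with a short sketch like the following. It again assumes an internal entity, and the variable names $plain and $expanded are chosen for this illustration only.

```php
<?php
$xml = '<!DOCTYPE test [<!ENTITY c "TEST">]>
<test>&c;</test>';

// Parsed without the flag: the tree keeps an entity reference node.
$plain = new DOMDocument();
$plain->loadXML($xml);

// Parsed with the flag: the parser substitutes the entity's text.
$expanded = new DOMDocument();
$expanded->loadXML($xml, LIBXML_NOENT);

// Serializing the root element shows what is stored in the tree.
echo $plain->saveXML($plain->documentElement), "\n";
// Reading textContent resolves the internal entity anyway.
echo $plain->documentElement->textContent, "\n";
// With LIBXML_NOENT the expansion is already part of the tree.
echo $expanded->saveXML($expanded->documentElement), "\n";
```

The first line of output should still contain &c;, while the second and third should show TEST. So omitting the flag keeps the reference in the serialized document, even though textContent reads through it.');
assert($plain->documentElement->textContent === 'TEST');
assert($expanded->saveXML($expanded->documentElement) === '<test>TEST</test>');
</test>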