How can I use PHP's various XML libraries to get DOM-like functionality and avoid DoS vulnerabilities such as Billion Laughs or Quadratic Blowup?
I'm writing a web application in PHP that has an XML API, and I'm worried about three specific vulnerabilities, all related to inline DOCTYPE definitions: local file inclusion (LFI), quadratic entity blowup, and exponential entity blowup. I'd love to use PHP's (5.3) built-in libraries, but I want to make sure I'm not susceptible to these.
I found I can eliminate LFI with libxml_disable_entity_loader, but this doesn't help with inline ENTITY declarations, including entities that refer to other entities.
The SimpleXML library (SimpleXMLElement, simplexml_load_string, etc.) is great because it's a DOM parser and all my inputs are fairly small; it lets me use XPath and manipulate the DOM pretty easily. I can't figure out how to stop ENTITY declarations. (I would be happy to disable all inline DOCTYPE definitions, if possible.)
The XML Parser library (xml_parser_create, xml_set_element_handler, etc.) allows me to set the default handler, which covers entities, with xml_set_default_handler. I can hack it so that for unrecognized entities it simply returns the original string (i.e., "&ent;"). This library is frustrating though: because it is a SAX parser, I have to write a bunch of handlers (as many as 9).
So is it possible to use the built-in libraries, get DOM-like objects out, and protect myself from these various DoS vulnerabilities? Thanks.
This page describes the three vulnerabilities and provides a solution... if only I were using .NET: http://msdn.microsoft.com/en-us/magazine/ee335713.aspx
Update:
<?php
$s = <<<EOF
<?xml version="1.0"?>
<!DOCTYPE data [
<!ENTITY en "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa....">
]>
<data>&en;&en;&en;&en;&en;&en;&en;&en;&en;&en;&en;&en;.....</data>
EOF;
$doc = new DOMDocument();
$doc->loadXML($s);
var_dump($doc->lastChild->nodeValue);
?>
I tried loadXML($s, LIBXML_NOENT); as well. In both cases I end up dumping 300+ MB. Is there something I'm still missing?
Recommended answer
Note: If you create test cases from files that contain the XML chunks below, expect that editors may be prone to these attacks as well and might freeze or crash.
Billion Laughs
<?xml version="1.0"?>
<!DOCTYPE lolz [
<!ENTITY lol "lol">
<!ENTITY lol1 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
<!ENTITY lol2 "&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;">
<!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
<!ENTITY lol4 "&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;">
<!ENTITY lol5 "&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;">
<!ENTITY lol6 "&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;">
<!ENTITY lol7 "&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;">
<!ENTITY lol8 "&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;">
<!ENTITY lol9 "&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;">
]>
<lolz>&lol9;</lolz>
When loading:
FATAL: #89: Detected an entity reference loop 1:7
... (plus six times the same = seven times total with above)
FATAL: #89: Detected an entity reference loop 14:13
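The error lines above have the shape level, code, message, line:column. A minimal sketch of how such lines could be collected from libxml's internal error buffer follows. The helper name collectLibxmlErrors and the exact output format are assumptions for illustration, not the code that produced the listing above.

```php
<?php
// Hypothetical helper: load an XML string and return libxml's messages
// formatted as "LEVEL: #code: message line:column".
function collectLibxmlErrors($xml, $options = 0)
{
    $levels = array(
        LIBXML_ERR_WARNING => 'WARNING',
        LIBXML_ERR_ERROR   => 'ERROR',
        LIBXML_ERR_FATAL   => 'FATAL',
    );

    // Buffer libxml errors instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    $doc = new DOMDocument();
    $doc->loadXML($xml, $options);

    $lines = array();
    foreach (libxml_get_errors() as $error) {
        $level = isset($levels[$error->level]) ? $levels[$error->level] : 'UNKNOWN';
        $lines[] = sprintf(
            '%s: #%d: %s %d:%d',
            $level,
            $error->code,
            trim($error->message),
            $error->line,
            $error->column
        );
    }

    // Restore the previous error handling mode.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $lines;
}

// Usage with a deliberately malformed (but harmless) document.
foreach (collectLibxmlErrors('<a><b></a>') as $line) {
    echo $line, "\n";
}
```

Which messages appear for an entity attack, and how many, depends on the libxml2 version that PHP is built against.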
Result:
<?xml version="1.0"?>
Memory usage is light; DOMDocument does not touch the peak. Since this example shows seven fatal errors, one can conclude, and it is indeed so, that the following shorter variant loads without errors:
<?xml version="1.0"?>
<!DOCTYPE lolz [
<!ENTITY lol "lol">
<!ENTITY lol1 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
<!ENTITY lol2 "&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;">
]>
<lolz>&lol2;</lolz>
Since entity substitution is not in effect and this works, let's try the Quadratic Blowup.
That is the one here, shortened for your viewing pleasure (my variants are about 27/11 KB):
<?xml version="1.0"?>
<!DOCTYPE kaboom [
<!ENTITY a "aaaaaaaaaaaaaaaaaa...">
]>
<kaboom>&a;&a;&a;&a;&a;&a;&a;&a;&a;...</kaboom>
If you use $doc->loadXML($src, LIBXML_NOENT); this does work as an attack: while I write this, the script is still loading. So it actually takes some time to load and consumes memory. It is something you can play with on your own. Without LIBXML_NOENT it works flawlessly and fast.
But there is a caveat: if you obtain the nodeValue of a tag, for example, you will get the entities expanded even if you don't use that loading flag.
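The caveat can be reproduced with a tiny, harmless entity. The following is a minimal sketch; the sample document and the variable names are made up for illustration.

```php
<?php
// A small stand-in for the quadratic payload: one short entity, used three times.
$xml = <<<'XML'
<?xml version="1.0"?>
<!DOCTYPE kaboom [
<!ENTITY a "aaaa">
]>
<kaboom>&a;&a;&a;</kaboom>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml); // deliberately without LIBXML_NOENT

// The tree still holds entity reference nodes, so serializing keeps "&a;" as-is.
$serialized = $doc->saveXML($doc->documentElement);

// Reading the text value expands the references on access.
$text = $doc->documentElement->nodeValue;

echo $serialized, "\n";
echo strlen($text), " characters of expanded text\n";
```

With a large entity and many references, that single nodeValue read is where the time and memory go, even though loading itself was cheap.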
A workaround for this problem is to remove the DocumentType node from the document. Note the following code:
$doc = new DOMDocument();
$doc->loadXML($s); // where $s is a Quadratic attack xml string above.
// now remove the doctype node
foreach ($doc->childNodes as $child) {
    if ($child->nodeType===XML_DOCUMENT_TYPE_NODE) {
        $doc->removeChild($child);
        break;
    }
}
// Now the following is true:
assert($doc->doctype===NULL);
assert($doc->lastChild->nodeValue==='...');
// Note that entities remain unexpanded in the output XML
// This is not so good since this makes the XML invalid.
// Better is a manual walk through all nodes looking for XML_ENTITY_REF_NODE
assert($doc->saveXML()==='<?xml version="1.0"?>
<kaboom>&a;&a;&a;&a;&a;&a;&a;&a;&a;...</kaboom>
');
// however, canonicalization will produce warnings because it must resolve entities
assert($doc->C14N()===False);
// Warning will be like:
// PHP Warning: DOMNode::C14N(): Node XML_ENTITY_REF_NODE is invalid here
So while this workaround prevents an XML document from consuming resources in a DoS, it makes it easy to generate invalid XML.
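The code comment above suggests a manual walk over all nodes as the better option. A minimal sketch of such a walk follows, assuming that replacing each entity reference with a literal text node is acceptable for the application. The function name neutralizeEntityReferences is hypothetical and this is not the answer's own code. It only covers references in element content, not in attribute values.

```php
<?php
// Hypothetical helper: replace entity reference nodes in element content
// below $node with plain text nodes that spell out the reference ("&a;").
function neutralizeEntityReferences(DOMNode $node)
{
    // Copy the children first, because the live node list changes while editing.
    $children = array();
    foreach ($node->childNodes as $child) {
        $children[] = $child;
    }

    foreach ($children as $child) {
        if ($child->nodeType === XML_ENTITY_REF_NODE) {
            $literal = $node->ownerDocument->createTextNode('&' . $child->nodeName . ';');
            $node->replaceChild($literal, $child);
        } elseif ($child->nodeType === XML_ELEMENT_NODE) {
            neutralizeEntityReferences($child);
        }
    }
}

// Usage on a small, harmless sample document.
$xml = <<<'XML'
<?xml version="1.0"?>
<!DOCTYPE kaboom [
<!ENTITY a "aaaa">
]>
<kaboom>&a;&a;&a;</kaboom>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml); // without LIBXML_NOENT, so references stay as nodes

// Replace the references first, then drop the DOCTYPE that declared them.
neutralizeEntityReferences($doc->documentElement);
if ($doc->doctype !== null) {
    $doc->removeChild($doc->doctype);
}

echo $doc->saveXML($doc->documentElement), "\n";
```

After the walk, the serialized output no longer contains unresolved references, because each one has become escaped text. Whether dropping the entity content this way is acceptable depends on what the API expects.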
Some figures (I reduced the file size, otherwise it takes too long) (code):
LIBXML_NOENT disabled                                     | LIBXML_NOENT enabled
Mem: 356 184 (Peak: 435 464)                              | Mem: 356 280 (Peak: 435 464)
Loaded file quadratic-blowup-2.xml into string.           | Loaded file quadratic-blowup-2.xml into string.
Mem: 368 400 (Peak: 435 464)                              | Mem: 368 496 (Peak: 435 464)
DOMDocument loaded XML 11 881 bytes in 0.001368 secs.     | DOMDocument loaded XML 11 881 bytes in 15.993627 secs.
Mem: 369 088 (Peak: 435 464)                              | Mem: 369 184 (Peak: 435 464)
Removed load string.                                      | Removed load string.
Mem: 357 112 (Peak: 435 464)                              | Mem: 357 208 (Peak: 435 464)
Got XML (saveXML()), length: 11 880                       | Got XML (saveXML()), length: 11 165 132
Got Text (nodeValue), length: 11 160 314; 11.060893 secs. | Got Text (nodeValue), length: 11 160 314; 0.025360 secs.
Mem: 11 517 776 (Peak: 11 532 016)                        | Mem: 11 517 872 (Peak: 22 685 360)
I have not made up my mind so far about protection strategies, but I now know that loading the Billion Laughs into PHPStorm will freeze it, for example, and I stopped testing the latter because I didn't want to freeze it while writing this.
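One possible strategy, given that the question would accept disabling inline DOCTYPE definitions entirely, is to refuse any document that carries a DOCTYPE before reading any of its content. The sketch below is an assumption about what such a check could look like, not a strategy this answer settles on. The function name loadXmlWithoutDoctype is hypothetical. It relies on the observations above that loading without LIBXML_NOENT is fast and that the expensive expansion only happens when values are read.

```php
<?php
// Hypothetical helper: parse $xml and reject it if it contains a DOCTYPE.
function loadXmlWithoutDoctype($xml)
{
    // Buffer libxml errors so that malformed input does not emit warnings.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $ok = $doc->loadXML($xml); // deliberately without LIBXML_NOENT

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$ok) {
        throw new InvalidArgumentException('XML could not be parsed.');
    }

    // Reject before any nodeValue or textContent access can expand entities.
    foreach ($doc->childNodes as $child) {
        if ($child->nodeType === XML_DOCUMENT_TYPE_NODE) {
            throw new InvalidArgumentException('DOCTYPE declarations are not allowed.');
        }
    }

    return $doc;
}

// Usage: a document with an inline DOCTYPE is refused.
try {
    loadXmlWithoutDoctype('<!DOCTYPE d [<!ENTITY a "aa">]><d>&a;&a;</d>');
} catch (InvalidArgumentException $e) {
    echo 'Rejected: ', $e->getMessage(), "\n";
}
```

The returned DOMDocument can be handed to simplexml_import_dom if the SimpleXML interface is preferred.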