XPath in SimpleXML for default namespaces without needing prefixes
I have an XML document that has a default namespace attached to it, e.g.
<foo xmlns="http://www.example.com/ns/1.0">
...
</foo>
In reality this is a complex XML document that conforms to a complex schema. My job is to parse some data out of it. To help me, I have a spreadsheet of XPath expressions. The expressions are rather deeply nested, e.g.
level1/level2/level3[@foo="bar"]/level4[@foo="bar"]/level5/level6[2]
The person who generated the XPath is an expert in the schema, so I am working on the assumption that I can't simplify it or use object-traversal shortcuts.
I am using SimpleXML to parse everything out. My problem has to do with how the default namespace gets handled.
Since there is a default namespace on the root element, I can't just do
$xml = simplexml_load_file($somepath);
$node = $xml->xpath('level1/level2/level3[@foo="bar"]/level4[@foo="bar"]/level5/level6[2]');
I have to register the namespace, assign it to a prefix, and then use the prefix in my XPath, e.g.
$xml = simplexml_load_file($somepath);
$xml->registerXPathNamespace('myns', 'http://www.example.com/ns/1.0');
$node = $xml->xpath('myns:level1/myns:level2/myns:level3[@foo="bar"]/myns:level4[@foo="bar"]/myns:level5/myns:level6[2]');
Adding the prefixes by hand isn't going to be manageable in the long run.
Is there a proper way to handle default namespaces without needing to use prefixes in XPath?
Using an empty prefix doesn't work ($xml->registerXPathNamespace('', 'http://www.example.com/ns/1.0');). I can strip out the default namespace, e.g.
$xml = file_get_contents($somepath);
$xml = str_replace('xmlns="http://www.example.com/ns/1.0"', '', $xml);
$xml = simplexml_load_string($xml);
but that is side-stepping the issue.
Recommended answer
From a bit of reading online, this is not a limitation of PHP or of any particular library, but of XPath itself, at least in XPath version 1.0.
XPath 1.0 does not include any concept of a "default" namespace, so regardless of how the element names appear in the XML source, if they have a namespace bound to them, the selectors for them must be prefixed in basic XPath selectors of the form ns:name. Note that ns is a prefix defined within the XPath processor, not by the document being processed, so it has no relationship to how xmlns attributes are used in the XML representation.
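A minimal sketch of that point, assuming a made-up two-level document in the namespace from the question. The prefix `anything` is arbitrary and appears nowhere in the XML; it only has to be registered with the XPath processor:

```php
<?php
// Illustrative document: the default namespace is declared without any prefix.
$xml = simplexml_load_string(
    '<foo xmlns="http://www.example.com/ns/1.0">'
    . '<level1><level2>hello</level2></level1>'
    . '</foo>'
);

// Unprefixed names in XPath 1.0 mean "in no namespace", so this selects nothing.
$plain = $xml->xpath('level1/level2');

// Bind an arbitrary prefix to the namespace URI on the XPath side.
$xml->registerXPathNamespace('anything', 'http://www.example.com/ns/1.0');
$prefixed = $xml->xpath('anything:level1/anything:level2');
```

Here `$plain` comes back empty, while `$prefixed` holds the `level2` element, even though the document itself never uses a prefix.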
See e.g. this "common XSLT mistakes" page, which discusses the closely related XSLT 1.0:
To access namespaced elements in XPath, you must define a prefix for their namespace. [...] Unfortunately, XSLT version 1.0 has no concept similar to a default namespace; therefore, you must repeat namespace prefixes again and again.
According to an answer to a similar question, XPath 2.0 does include a notion of a "default namespace", and the XSLT page linked above also mentions this in the context of XSLT 2.0.
Unfortunately, all of the built-in XML extensions in PHP are built on top of the libxml2 and libxslt libraries, which support only version 1.0 of XPath and XSLT.
So other than pre-processing the document so that it does not use namespaces, your only option would be to find an XPath 2.0 processor that you could plug in to PHP.
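If the prefixes have to stay, one way to make them manageable is to add them programmatically instead of editing every spreadsheet entry by hand. The following is only a sketch: `prefixXPathSteps` is a hypothetical helper, not part of SimpleXML, and it assumes simple paths like the one in the question. It prefixes any name that starts the expression or directly follows a `/`, so it will mangle string literals containing `/` and will not prefix element names that appear elsewhere inside predicates or after an explicit axis.

```php
<?php
/**
 * Hypothetical helper: add a namespace prefix to each unprefixed
 * element step of a simple XPath 1.0 location path.
 */
function prefixXPathSteps(string $xpath, string $prefix): string
{
    // Group 1: start of the expression or a "/" step separator.
    // Group 2: an element name, matched possessively so it cannot be shortened.
    // The lookahead skips names followed by ":" (already prefixed, or an axis)
    // or by "(" (function calls and node tests such as text()).
    return preg_replace(
        '~(^|/)([A-Za-z_][\w.-]*+)(?![:(])~',
        '${1}' . $prefix . ':${2}',
        $xpath
    );
}

$path = 'level1/level2/level3[@foo="bar"]/level4[@foo="bar"]/level5/level6[2]';
$prefixedPath = prefixXPathSteps($path, 'myns');
```

The result can then be passed to `$xml->xpath()` after calling `registerXPathNamespace('myns', ...)` as in the question. Attribute tests such as `@foo` are deliberately left alone.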
(As an aside, it's worth noting that if you have unprefixed attributes in your XML document, they are not technically in the default namespace, but rather in no namespace at all; see XML Namespaces and Unprefixed Attributes for a discussion of this oddity of the Namespace spec.)
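A small sketch of what that means for the predicates in the question, assuming an illustrative one-element document: the element name needs the registered prefix, but the unprefixed attribute must be tested without one.

```php
<?php
// Illustrative document: level1 is in the default namespace, its "foo" attribute is not.
$xml = simplexml_load_string(
    '<foo xmlns="http://www.example.com/ns/1.0">'
    . '<level1 foo="bar">x</level1>'
    . '</foo>'
);
$xml->registerXPathNamespace('myns', 'http://www.example.com/ns/1.0');

// The attribute is in no namespace, so the plain @foo test matches.
$plainAttr = $xml->xpath('myns:level1[@foo="bar"]');

// Prefixing the attribute asks for a namespaced attribute, which does not exist here.
$prefixedAttr = $xml->xpath('myns:level1[@myns:foo="bar"]');
```

This is why `[@foo="bar"]` stays unprefixed in the working query from the question.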