How to preserve namespaces when parsing XML with ElementTree in Python

Problem description

Assume that I have the following XML, which I want to modify using Python's ElementTree:

<root xmlns:prefix="URI">
  <child prefix:name="***"/>
  ...
</root> 

I'm making some modifications to the XML file like this:

import xml.etree.ElementTree as ET
tree = ET.parse('filename.xml')
# XML modification here
# save the modifications
tree.write('filename.xml')

Afterwards the XML file looks like this:

<root xmlns:ns0="URI">
  <child ns0:name="***"/>
  ...
</root>

As you can see, the namespace prefix has changed to ns0. I'm aware of ET.register_namespace().

The problems with ET.register_namespace() are:

  1. You need to know both the prefix and the URI in advance.
  2. It cannot be used with the default namespace.

For example, if the XML looks like:

<root xmlns="http://uri">
    <child name="name">
    ...
    </child>
</root>

It will be transformed into:

<ns0:root xmlns:ns0="http://uri">
    <ns0:child name="name">
    ...
    </ns0:child>
</ns0:root>

As you can see, the default namespace has been replaced by the ns0 prefix.
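The behaviour can be reproduced in memory with a short sketch. It assumes a fresh interpreter in which nothing has been registered for this URI, and it uses ET.fromstring and ET.tostring in place of the file-based parse and write calls:

```python
import xml.etree.ElementTree as ET

# Same shape as the example document, parsed from a string instead of a file.
xml_text = '<root xmlns="http://uri"><child name="name"/></root>'
root = ET.fromstring(xml_text)

# ElementTree stores the namespace URI inside the tag and drops the prefix,
# so the serializer has to make up a prefix when writing the tree back out.
tag = root.tag
serialized = ET.tostring(root, encoding='unicode')
print(tag)
print(serialized)
```

The parsed tree keeps only the URI (in the form {URI}tag), which is why the original prefix, or the lack of one, is not available at write time unless it has been registered.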

Is there any way to solve this problem with ElementTree?


Solution

ElementTree replaces the prefix of any namespace that has not been registered with ET.register_namespace. To preserve a namespace prefix, you need to register it before writing your modifications to the file. The following function does that, globally registering every namespace declared in the file:

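def register_all_namespaces(filename):
    # Collect every (prefix, URI) pair declared in the file; the default
    # namespace is reported with an empty-string prefix.
    namespaces = dict(node for _, node in ET.iterparse(filename, events=['start-ns']))
    for prefix, uri in namespaces.items():
        ET.register_namespace(prefix, uri)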
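To make the 'start-ns' events more concrete, here is a small sketch that lists what ET.iterparse yields for them. The sample document and its example.com URIs are made up for illustration, and io.BytesIO stands in for a file on disk:

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical document with one default and one prefixed namespace.
sample = (
    b'<root xmlns="http://example.com/a" xmlns:p="http://example.com/b">'
    b'<p:child/></root>'
)

# With events=['start-ns'], each item is ('start-ns', (prefix, uri)).
# The default namespace shows up with an empty-string prefix.
declared = dict(ns for _, ns in ET.iterparse(io.BytesIO(sample), events=['start-ns']))
print(declared)
```

Because the pairs are collected into a dict keyed by prefix, a prefix that is bound to different URIs in different parts of the document keeps only the last binding.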

Call register_all_namespaces before ET.parse so that the namespaces are left unchanged:

import xml.etree.ElementTree as ET
register_all_namespaces('filename.xml')
tree = ET.parse('filename.xml')
# XML modification here
# save the modifications
tree.write('filename.xml')
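As a rough end-to-end check, the sketch below runs the same steps against a temporary file. The sample document, the meta prefix, the example.com URIs, and the modified attribute are all placeholders chosen for illustration:

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def register_all_namespaces(filename):
    # Register every (prefix, URI) pair declared in the file.
    namespaces = dict(node for _, node in ET.iterparse(filename, events=['start-ns']))
    for prefix, uri in namespaces.items():
        ET.register_namespace(prefix, uri)


# Hypothetical input with a default namespace and one prefixed namespace.
source = (
    '<root xmlns="http://example.com/default" xmlns:meta="http://example.com/meta">'
    '<child meta:name="a"/></root>'
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'sample.xml')
    with open(path, 'w', encoding='utf-8') as f:
        f.write(source)

    register_all_namespaces(path)
    tree = ET.parse(path)
    tree.getroot().set('modified', 'yes')  # stand-in for a real modification
    tree.write(path)

    with open(path, encoding='utf-8') as f:
        result = f.read()

print(result)
```

Note that ET.register_namespace updates a process-wide registry that maps each URI to a single prefix, so registrations made for one file also affect any tree serialized later in the same process.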
