Common Python XPath Problems and Solutions

2023-04-17 00:00:00 python solutions common-problems
  1. How do I parse an XML file?

Use the xml package from Python's standard library and follow these steps:

1) Import xml.etree.ElementTree

import xml.etree.ElementTree as ET

2) Use the parse function to parse the XML file into an ElementTree object

tree = ET.parse('example.xml')

3) Get the root node

root = tree.getroot()

4) Use the methods of the Element object to get nodes and look up data

for child in root:
    print(child.tag, child.attrib)

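A concrete example, example.xml. The element names author, title, genre and price match those used in the code of question 5; the names catalog, publish_date and description are assumed here:

<catalog>
  <book id="bk001">
    <author>Gambardella, Matthew</author>
    <title>XML Developer's Guide</title>
    <genre>Computer</genre>
    <price>44.95</price>
    <publish_date>2000-10-01</publish_date>
    <description>An in-depth look at creating applications
    with XML.</description>
  </book>
  <book id="bk002">
    <author>Ralls, Kim</author>
    <title>Midnight Rain</title>
    <genre>Fantasy</genre>
    <price>5.95</price>
    <publish_date>2000-12-16</publish_date>
    <description>A former architect battles corporate zombies,
    an evil sorceress, and her own childhood to become queen
    of the world.</description>
  </book>
</catalog>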

Code demo:

import xml.etree.ElementTree as ET

tree = ET.parse('example.xml')

root = tree.getroot()

for child in root:
    print(child.tag, child.attrib)

Output:

book {'id': 'bk001'}
book {'id': 'bk002'}
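
Step 4 mentions the lookup methods of Element without showing them. The sketch below is one possible illustration using find(), findall(), findtext() and get() from xml.etree.ElementTree. It parses a shortened in-memory copy of the sample data with ET.fromstring so that it runs without a file; the root element name catalog and the shortened layout are assumptions.

```python
import xml.etree.ElementTree as ET

# A shortened, in-memory stand-in for example.xml (assumed layout).
XML = """\
<catalog>
  <book id="bk001">
    <author>Gambardella, Matthew</author>
    <title>XML Developer's Guide</title>
    <price>44.95</price>
  </book>
  <book id="bk002">
    <author>Ralls, Kim</author>
    <title>Midnight Rain</title>
    <price>5.95</price>
  </book>
</catalog>
"""

# fromstring() returns the root Element directly (no getroot() needed).
root = ET.fromstring(XML)

# findall() returns the direct <book> children; findtext() reads a child's text.
titles = [book.findtext("title") for book in root.findall("book")]

# find() returns the first match or None, so check before using the result.
first_book = root.find("book")
first_id = first_book.get("id") if first_book is not None else None

# Element text is always a string; convert explicitly when numbers are needed.
total_price = sum(float(book.findtext("price")) for book in root.findall("book"))

print(titles)
print(first_id)
print(round(total_price, 2))
```

When reading from disk, ET.parse('example.xml').getroot() gives the same kind of root element as ET.fromstring does here.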

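  2. How do I select nodes with XPath?

Use the xpath() method of the third-party lxml library (install it with pip install lxml); it is not part of the standard library. The steps are:

1) Import the lxml library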

from lxml import etree

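2) Parse the XML file. etree.parse returns an ElementTree object rather than an Element, and that object also provides xpath()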

root = etree.parse('example.xml')

3) Select nodes with the xpath method

Select all book nodes

root.xpath('//book')

Select the author of the book whose id is bk002

root.xpath('//book[@id="bk002"]/author')

Select all author elements

root.xpath('//author')

The example again uses the example.xml file shown under question 1.

Code demo:

from lxml import etree

tree = etree.parse('example.xml')

# Select all book nodes
books = tree.xpath('//book')
for book in books:
    print(book)

# Select the author of the book whose id is bk002
author = tree.xpath('//book[@id="bk002"]/author')
print(author[0].text)

# Select all author elements
authors = tree.xpath('//author')
for author in authors:
    print(author.text)

Output (the memory addresses vary from run to run):

<Element book at 0x...>
<Element book at 0x...>
Ralls, Kim
Gambardella, Matthew
Ralls, Kim
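
If lxml is not available, the same three selections can be approximated with the limited XPath subset built into xml.etree.ElementTree. This is only a sketch on a shortened in-memory copy of the sample data (the root name catalog is an assumption), and it is not a replacement for the full XPath 1.0 support in lxml.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for example.xml (assumed layout).
XML = """\
<catalog>
  <book id="bk001"><author>Gambardella, Matthew</author></book>
  <book id="bk002"><author>Ralls, Kim</author></book>
</catalog>
"""

root = ET.fromstring(XML)

# Paths on an Element must be relative: './/book' plays the role of '//book'.
books = root.findall(".//book")

# Attribute predicates such as [@id='bk002'] are part of the supported subset.
kim = root.findall(".//book[@id='bk002']/author")

# iter() walks all descendants with the given tag, like '//author'.
authors = [author.text for author in root.iter("author")]

print(len(books))
print(kim[0].text)
print(authors)
```

Both findall() here and xpath() in lxml return lists, so an empty result should be checked for before indexing with [0].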

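  3. How do I select attributes with XPath?

Use the xpath() method of the third-party lxml library (not part of the standard library) to select attributes. The steps are:

1) Import the lxml library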

from lxml import etree

2) Parse the XML file into an ElementTree object

root = etree.parse('example.xml')

3) Select attributes

Select the id attribute of every book

root.xpath('//book/@id')

Select the author text of the book whose id attribute equals bk002

root.xpath('//book[@id="bk002"]/author/text()')

The example again uses the example.xml file shown under question 1.

Code demo:

from lxml import etree

tree = etree.parse('example.xml')

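# Select the id attribute of every book
ids = tree.xpath('//book/@id')
for book_id in ids:  # avoid shadowing the built-in id()
    print(book_id)

# Select the author text of the book whose id attribute equals bk002
author = tree.xpath('//book[@id="bk002"]/author/text()')
print(author[0])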

Output:

bk001
bk002
Ralls, Kim
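
For comparison, the standard library's ElementTree has no '/@id' step, so attributes are read from the matched elements instead. The sketch below shows one way to do that; the third book without an id is hypothetical and exists only to show how a missing attribute behaves, and the root name catalog is an assumption.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for example.xml; the third <book> without an id is
# hypothetical and only there to show how missing attributes behave.
XML = """\
<catalog>
  <book id="bk001"><author>Gambardella, Matthew</author></book>
  <book id="bk002"><author>Ralls, Kim</author></book>
  <book><author>Anonymous</author></book>
</catalog>
"""

root = ET.fromstring(XML)

# The [@id] predicate keeps only the elements that carry the attribute.
ids = [book.get("id") for book in root.findall(".//book[@id]")]

# get() takes a default that is returned when the attribute is missing.
ids_or_default = [book.get("id", "(none)") for book in root.iter("book")]

# findtext() returns the text of the first match, similar to taking
# element [0] of an lxml '.../author/text()' result.
author = root.findtext(".//book[@id='bk002']/author")

print(ids)
print(ids_or_default)
print(author)
```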

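  4. How do I select specific elements with XPath?

Follow these steps:

1) Use xpath to select the elements you are interested in

2) Narrow the result with predicates on attributes or child elements as needed

For example, select every book element whose author element has text containing "Matthew":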

root.xpath('//book[author[contains(text(),"Matthew")]]')

The example again uses the example.xml file shown under question 1.

Code demo:

from lxml import etree

tree = etree.parse('example.xml')

# Select every book element whose author text contains "Matthew"
books = tree.xpath('//book[author[contains(text(),"Matthew")]]')
for book in books:
    print(book)

Output (only the first book matches; the memory address varies from run to run):

<Element book at 0x...>
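
The XPath subset in the standard library's ElementTree does not include contains(), so a comparable filter has to be written in Python. The sketch below shows a substring filter next to the exact-match predicate that ElementTree does support; the shortened sample data and the root name catalog are assumptions.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for example.xml (assumed layout).
XML = """\
<catalog>
  <book id="bk001"><author>Gambardella, Matthew</author></book>
  <book id="bk002"><author>Ralls, Kim</author></book>
</catalog>
"""

root = ET.fromstring(XML)

# Substring match done in Python; 'or ""' guards against a missing <author>.
matthew_books = [
    book for book in root.findall("book")
    if "Matthew" in (book.findtext("author") or "")
]

# Exact match on a child's text is supported as a predicate: [tag='text'].
kim_books = root.findall("book[author='Ralls, Kim']")

print([book.get("id") for book in matthew_books])
print([book.get("id") for book in kim_books])
```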

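  5. How do I modify an XML document?

Use the attributes and methods of Element objects from the standard library's xml.etree.ElementTree module to modify elements:

1) Change an element's text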

elem.text = "new text"

2) Change an element's attribute

elem.set("attrib", "new value")

3) Change an element's tag (XML names cannot contain spaces)

elem.tag = "new_tag"
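
The three modifications above can be tried on a small element built in memory. This is a minimal sketch; the element and the attribute names lang and item are made up for illustration.

```python
import xml.etree.ElementTree as ET

# A tiny element to experiment on (hypothetical content).
elem = ET.fromstring('<book id="bk001">old text</book>')

elem.text = "new text"      # 1) change the text
elem.set("id", "bk100")     # 2) change an existing attribute
elem.set("lang", "en")      #    set() also creates a new attribute
elem.tag = "item"           # 3) rename the element

# encoding="unicode" makes tostring() return str instead of bytes.
result = ET.tostring(elem, encoding="unicode")
print(result)
```

The changes only affect the in-memory tree; nothing is written to disk until the tree is saved, for example with ElementTree.write().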

The example again uses the example.xml file shown under question 1.

Code demo:

import xml.etree.ElementTree as ET

tree = ET.parse('example.xml')

root = tree.getroot()

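# Change the author of the first book
root[0].find('author').text = '皮蛋编程'

# Add a new book under the second book element.
# Note: root[1].append() nests it inside that book;
# root.append(new_book) would add it as a sibling instead.
new_book = ET.Element("book")
new_book.attrib["id"] = "bk003"

author = ET.SubElement(new_book, "author")
author.text = "pidancode.com"

title = ET.SubElement(new_book, "title")
title.text = "Python XPath tutorial"

genre = ET.SubElement(new_book, "genre")
genre.text = "Computer"

price = ET.SubElement(new_book, "price")
price.text = "9.99"

root[1].append(new_book)

# Write as UTF-8 so that non-ASCII text is stored as-is
# rather than as numeric character references
tree.write('example.xml', encoding='utf-8')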

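Resulting example.xml, re-indented here for readability (ElementTree writes the new book on a single line; catalog, publish_date and description are assumed element names):

<catalog>
  <book id="bk001">
    <author>皮蛋编程</author>
    <title>XML Developer's Guide</title>
    <genre>Computer</genre>
    <price>44.95</price>
    <publish_date>2000-10-01</publish_date>
    <description>An in-depth look at creating applications
    with XML.</description>
  </book>
  <book id="bk002">
    <author>Ralls, Kim</author>
    <title>Midnight Rain</title>
    <genre>Fantasy</genre>
    <price>5.95</price>
    <publish_date>2000-12-16</publish_date>
    <description>A former architect battles corporate zombies,
    an evil sorceress, and her own childhood to become queen
    of the world.</description>
    <book id="bk003">
      <author>pidancode.com</author>
      <title>Python XPath tutorial</title>
      <genre>Computer</genre>
      <price>9.99</price>
    </book>
  </book>
</catalog>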
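
Elements created with ET.Element and ET.SubElement carry no whitespace, so they are serialized on a single line. The sketch below shows one way to tidy the result with ET.indent, which assumes Python 3.9 or later, and also how the encoding choice affects non-ASCII text. The small in-memory document is a stand-in for example.xml, and the new book is added as a sibling here rather than nested.

```python
import xml.etree.ElementTree as ET

# Small stand-in document (assumed layout).
root = ET.fromstring(
    "<catalog><book id='bk001'><author>Ralls, Kim</author></book></catalog>"
)

# Add a new book as a sibling of the existing one.
new_book = ET.SubElement(root, "book", {"id": "bk003"})
ET.SubElement(new_book, "author").text = "皮蛋编程"

# Insert newlines and indentation in place (Python 3.9+).
ET.indent(root, space="  ")

# Default encoding is us-ascii: non-ASCII characters become &#...; references.
ascii_bytes = ET.tostring(root)

# encoding="unicode" returns a str and keeps the characters as they are.
utf8_text = ET.tostring(root, encoding="unicode")
print(utf8_text)
```

The same encoding consideration applies to ElementTree.write(): passing encoding='utf-8' keeps non-ASCII text readable in the saved file.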

