Querying HTML Data with XPath Operators in Python

2023-04-17 00:00:00 Query, data, operators

Import the required libraries

import requests
from lxml import etree

Request the page and parse it

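url = "https://www.pidancode.com"
# Fetch the page; fail early on HTTP errors instead of parsing an error page
response = requests.get(url, timeout=10)
response.raise_for_status()
# Pass the raw bytes so lxml can detect the encoding itself
html = etree.HTML(response.content)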
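
To try the parsing step without network access, the same call can be run on an inline HTML string. This is only a minimal sketch: the sample markup and the names `sample`, `root`, and `page_title` are made up for illustration and are not part of the original page.

```python
from lxml import etree

# Hypothetical sample markup, standing in for response.content.
sample = b"""
<html>
  <head><title>Demo page</title></head>
  <body>
    <div class="title">First heading</div>
    <a href="/about">About us</a>
  </body>
</html>
"""

# etree.HTML accepts str or bytes and returns the root <html> element.
root = etree.HTML(sample)

# xpath() returns a list for node-set results, so guard before indexing.
titles = root.xpath('//title/text()')
page_title = titles[0] if titles else None
print(page_title)
```

Checking that the list is non-empty before indexing avoids an `IndexError` when a page has no matching element.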

Query data with XPath operators
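- Get the page title
  title = html.xpath('//title/text()')[0]
  print(title)
- Get all of the text in the page body
  all_text = html.xpath('string(//body)')
  print(all_text)
- Get the text content of the first div element on the page whose class is "title"
  # Parentheses make [1] apply to the whole result set, not to each parent separately
  title_text = html.xpath('(//div[@class="title"])[1]/text()')[0]
  print(title_text)
- Get the text content of the first a element on the page whose href attribute is "/about"
  # Parentheses make [1] apply to the whole result set, not to each parent separately
  about_text = html.xpath('(//a[@href="/about"])[1]/text()')[0]
  print(about_text)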
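
The queries above rely on predicates and the `=` comparison. The sketch below shows a few more XPath 1.0 operators (`and`, `or`, `!=`, `|`) and how a positional predicate behaves with and without parentheses. The sample markup and all variable names are assumptions chosen for illustration, not taken from the page being scraped.

```python
from lxml import etree

# Hypothetical markup used only to demonstrate the operators.
sample = """
<html><body>
  <section><div class="title">A</div><div class="note">n1</div></section>
  <section><div class="title">B</div><div class="title">C</div></section>
  <a href="/about">About</a>
  <a href="/contact">Contact</a>
  <a href="/blog" class="nav">Blog</a>
</body></html>
"""
root = etree.HTML(sample)

# [1] binds tighter than //, so this keeps the first match under EACH parent.
per_parent = root.xpath('//div[@class="title"][1]/text()')

# Parentheses apply [1] to the combined result: the first match in the document.
first_only = root.xpath('(//div[@class="title"])[1]/text()')

# `or`: either condition may hold.
either = root.xpath('//a[@href="/about" or @href="/contact"]/text()')

# `and`: both conditions must hold.
both = root.xpath('//a[@class="nav" and @href="/blog"]/text()')

# `!=`: attribute present but different from the given value.
not_about = root.xpath('//a[@href!="/about"]/text()')

# `|`: union of two node-sets.
union = root.xpath('//div[@class="note"]/text() | //a[@class="nav"]/text()')

print(per_parent, first_only, either, both, not_about, union)
```

On this sample, `per_parent` contains two entries (one per `section`) while `first_only` contains one, which is why the parenthesised form is the safer choice when "the first element on the page" is what is wanted.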
