Python BeautifulSoup find_all() Method

2023-04-17 00:00:00 python beautifulsoup find

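Python's BeautifulSoup library provides functions for parsing HTML and XML documents, including the find_all() method. Given a set of filters, this method returns every tag in the document that matches them.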

Detailed usage is as follows:

Import the required libraries

from bs4 import BeautifulSoup
import requests

Send a request and get the HTML document

url = 'https://pidancode.com'
response = requests.get(url)
html_doc = response.text

Parse the HTML document

soup = BeautifulSoup(html_doc, 'html.parser')

Use find_all() to search for tags

soup.find_all('a')  # find all <a> tags
soup.find_all('a', href='https://pidancode.com')  # find <a> tags whose href is 'https://pidancode.com'
soup.find_all(['a', 'img'])  # find both <a> tags and <img> tags
soup.find_all('div', class_='container')  # find <div> tags whose class is 'container'
soup.find_all('a', string='皮蛋编程')  # find <a> tags whose text is '皮蛋编程'

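In the code above, the first argument to find_all() is the tag name to search for, or a list of tag names. Attribute filters such as href=... and class_=... are passed as keyword arguments, not as a fixed second positional parameter; the second positional parameter, attrs, accepts a dictionary of attribute names and values instead. The string keyword argument filters on a tag's text content.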
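The calls above can be tried without any network access. The following is a minimal sketch that assumes a small inline HTML snippet (made up for illustration, not the real content of pidancode.com) and also shows the limit keyword argument, which caps the number of results.

```python
from bs4 import BeautifulSoup

# Hypothetical inline document, used only so the example runs offline.
html_doc = """
<div class="container">
  <a href="https://pidancode.com">皮蛋编程</a>
  <a href="https://pidancode.com/about">About</a>
  <img src="logo.png">
</div>
"""

soup = BeautifulSoup(html_doc, 'html.parser')

all_links = soup.find_all('a')                                  # filter by tag name
home_links = soup.find_all('a', href='https://pidancode.com')   # filter by attribute value
links_and_images = soup.find_all(['a', 'img'])                  # filter by a list of tag names
containers = soup.find_all('div', class_='container')           # filter by CSS class
named_links = soup.find_all('a', string='皮蛋编程')              # filter by exact text
first_link_only = soup.find_all('a', limit=1)                   # stop after the first match

for link in all_links:
    print(link['href'], link.get_text())
```

find_all() returns a list-like result set, so len(), indexing, and iteration all work on it. When nothing matches, the result is empty instead of an error being raised.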

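Note that each attribute filter is written as name=value. The attribute name is a bare keyword, and a string value is an ordinary Python string literal enclosed in single or double quotes. Because class is a reserved word in Python, the class attribute is written as class_.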
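Attribute values do not have to be plain strings. As a rough sketch, again assuming a made-up inline snippet, the example below uses a compiled regular expression, the value True (meaning "the attribute is present"), and the attrs dictionary for attribute names that cannot be written as Python keywords, such as data-role or name.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical markup for illustration only.
html_doc = """
<a class="nav link" href="https://pidancode.com">Home</a>
<a class="link" href="/about" data-role="menu">About</a>
<a name="top">Top</a>
"""

soup = BeautifulSoup(html_doc, 'html.parser')

# class_ matches when any one of the tag's classes equals the value.
link_class = soup.find_all('a', class_='link')

# A compiled regular expression is searched against the attribute value.
absolute = soup.find_all('a', href=re.compile(r'^https://'))

# True matches every tag that has the attribute, whatever its value.
with_href = soup.find_all('a', href=True)

# Names with a hyphen are not valid keywords, so they go in the attrs dict.
menu = soup.find_all('a', attrs={'data-role': 'menu'})

# 'name' is already the first parameter of find_all(), so it also needs attrs.
anchors = soup.find_all('a', attrs={'name': 'top'})
```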

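Finally, the string values used in the examples above are of two kinds: a URL (the href filter) and text content (the string filter).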
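A plain string passed to the string argument must match a tag's text exactly. The sketch below, which assumes a two-link snippet invented for this purpose, contrasts an exact match with a regular-expression match, and shows that passing string without a tag name returns the text nodes themselves.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical markup for illustration only.
html_doc = (
    '<p><a href="https://pidancode.com">皮蛋编程</a> '
    '<a href="/py">皮蛋编程 Python</a></p>'
)

soup = BeautifulSoup(html_doc, 'html.parser')

# Exact match: only the link whose whole text is '皮蛋编程'.
exact = soup.find_all('a', string='皮蛋编程')

# Regex match: every link whose text contains '皮蛋编程'.
partial = soup.find_all('a', string=re.compile('皮蛋编程'))

# No tag name given: the results are text nodes, not tags.
text_nodes = soup.find_all(string=re.compile('皮蛋编程'))
```

If partial matching is what you want, the regular-expression form is usually the safer choice, since an exact match fails on extra surrounding text or whitespace.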
