Automated Web Scraping with BeautifulSoup and Selenium

2023-04-17 00:00:00 scraping, automation, usage

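The steps are as follows:
1. Install BeautifulSoup and Selenium
Both can be installed with pip: pip install beautifulsoup4 and pip install selenium. Note that the PyPI package named BeautifulSoup is the obsolete version 3 release and does not provide the bs4 module imported below.
2. Import the required libraries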

from selenium import webdriver
from bs4 import BeautifulSoup

Launch the browser (this assumes Firefox is installed locally)

driver = webdriver.Firefox()

Open the target site

driver.get("http://pidancode.com")

Get the page source

html = driver.page_source

Parse the page source

soup = BeautifulSoup(html, 'html.parser')

Find the target element
For example, to find the page title, use the following code:

title = soup.find('title')
print(title.text)
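
The same soup object can locate other elements as well. As a rough illustration, the sketch below uses find_all to collect link targets and select_one to look up an element by CSS selector. SAMPLE_HTML is an invented stand-in for driver.page_source so the example runs without a browser, and the function names are hypothetical.

```python
from bs4 import BeautifulSoup

# Stand-in for driver.page_source so the example runs without a browser.
SAMPLE_HTML = """
<html>
  <head><title>Example page</title></head>
  <body>
    <h1 class="headline">Hello</h1>
    <a href="/first">First</a>
    <a href="/second">Second</a>
    <a>No href</a>
  </body>
</html>
"""


def extract_links(html):
    """Return the href of every <a> tag that has one."""
    soup = BeautifulSoup(html, "html.parser")
    return [a["href"] for a in soup.find_all("a", href=True)]


def extract_headline(html):
    """Return the text of the first h1 with class "headline", or None if absent."""
    soup = BeautifulSoup(html, "html.parser")
    node = soup.select_one("h1.headline")
    return node.get_text(strip=True) if node else None


print(extract_links(SAMPLE_HTML))     # ['/first', '/second']
print(extract_headline(SAMPLE_HTML))  # Hello
```

Checking for None before reading the text avoids an AttributeError when the element is missing from the page.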

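Close the browser. quit() ends the whole WebDriver session, whereas close() only closes the current window.

driver.quit()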
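
If an exception is raised anywhere between launching the browser and the cleanup line, the browser is left running. One way to guard against that is a small context manager that always calls quit(). This is only a sketch; managed_driver is a hypothetical helper, and factory is assumed to be any callable that returns an object with a quit() method, such as webdriver.Firefox.

```python
from contextlib import contextmanager


@contextmanager
def managed_driver(factory):
    """Create a driver with factory() and always call quit() on exit."""
    driver = factory()
    try:
        yield driver
    finally:
        driver.quit()  # runs even if the body of the with-block raised
```

Usage would look like: with managed_driver(webdriver.Firefox) as driver: followed by the get() and page_source calls indented inside the block.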

Complete code example:

from selenium import webdriver
from bs4 import BeautifulSoup
# Launch the browser
driver = webdriver.Firefox()
# Open the target site
driver.get("http://pidancode.com")
# Get the page source
html = driver.page_source
# Parse the page source
soup = BeautifulSoup(html, 'html.parser')
# Find the page title
title = soup.find('title')
print(title.text)
# Close the browser and end the WebDriver session
driver.quit()
