Processing and Displaying Web Page Data with BeautifulSoup in Flask

2023-07-30 16:01:18 Framework, Web Page, Data Processing

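The steps below show how to process and display web page data with BeautifulSoup in a Flask application.

Install the BeautifulSoup library

Run the following command on the command line (the later examples also import requests and flask, which can be installed with pip in the same way):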

pip install beautifulsoup4

Import BeautifulSoup in the Flask application

from bs4 import BeautifulSoup

Fetch the web page to process

Use the requests library to fetch the page content, for example:

import requests

url = 'https://pidancode.com/'
response = requests.get(url)
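
The call above has no timeout and does not check the HTTP status. A minimal sketch of a more defensive fetch follows; the helper name fetch_html, its session parameter, and the 10-second default are assumptions made for illustration, not part of the original example.

```python
import requests


def fetch_html(url, timeout=10.0, session=None):
    """Return the raw response body (bytes) for url.

    fetch_html is a hypothetical helper. `session` may be a requests.Session
    (or any object with a compatible get method); by default the
    module-level requests.get is used.
    """
    http = session if session is not None else requests
    # Without a timeout, a slow server could block the caller indefinitely.
    response = http.get(url, timeout=timeout)
    # Raise requests.HTTPError for 4xx/5xx responses instead of parsing an error page.
    response.raise_for_status()
    return response.content
```

Calling fetch_html('https://pidancode.com/') would return the page bytes, which can be passed to BeautifulSoup in the next step; network failures surface as requests.RequestException.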

Process the data with BeautifulSoup

Pass the fetched page content to BeautifulSoup and specify a parser, for example:

soup = BeautifulSoup(response.content, 'html.parser')
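
To try the parsing step without a network request, the sketch below feeds BeautifulSoup an inline document. RAW_BYTES and parse_html are hypothetical names; RAW_BYTES stands in for response.content and is assumed to be UTF-8 encoded.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for response.content: UTF-8 bytes with a declared charset.
RAW_BYTES = (
    '<html><head><meta charset="utf-8"><title>Café menu</title></head>'
    '<body><h2>Drinks</h2></body></html>'
).encode('utf-8')


def parse_html(raw):
    """Parse bytes or str using Python's built-in 'html.parser' backend."""
    return BeautifulSoup(raw, 'html.parser')


soup = parse_html(RAW_BYTES)
print(soup.title.string)   # text of the <title> tag
print(soup.h2.get_text())  # text of the first <h2> tag
```

BeautifulSoup accepts either bytes or an already decoded string; when given bytes it works out the encoding itself, which is why the example passes response.content rather than response.text.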

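You can then use the methods BeautifulSoup provides to find and extract data from the page, for example:

# Get the page title (soup.title is None if the page has no <title>)
title = soup.title.string if soup.title else None

# Find all h2 tags
h2_list = soup.find_all('h2')

# Get the text of the first h2 tag (find returns None if there is no match)
first_h2 = soup.find('h2')
h2_text = first_h2.get_text() if first_h2 else None

# Find all p tags inside the element with id "main-content"
p_list = soup.select('#main-content p')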
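
The lookups above can be exercised on a small inline document. This is a sketch under stated assumptions: SAMPLE_HTML and the summarize helper are hypothetical, and the markup is invented so that each selector has something to match.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page, made up for this illustration.
SAMPLE_HTML = """
<html>
  <head><title>Sample page</title></head>
  <body>
    <h2>First heading</h2>
    <div id="main-content">
      <p>Paragraph one.</p>
      <p>Paragraph two.</p>
    </div>
    <h2>Second heading</h2>
  </body>
</html>
"""


def summarize(html):
    """Collect the title, h2 texts, and #main-content paragraphs as plain strings."""
    soup = BeautifulSoup(html, 'html.parser')
    first_h2 = soup.find('h2')  # None when the page has no h2
    return {
        'title': soup.title.string if soup.title else None,
        'h2_texts': [h2.get_text(strip=True) for h2 in soup.find_all('h2')],
        'first_h2': first_h2.get_text(strip=True) if first_h2 else None,
        'paragraphs': [p.get_text(strip=True) for p in soup.select('#main-content p')],
    }


summary = summarize(SAMPLE_HTML)
print(summary)
```

find_all and select return empty lists when nothing matches, while find and soup.title return None, so only the single-element lookups need a guard.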

Display the processed data in the Flask application

Pass the processed data to a template and render it there, for example:

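import requests
from bs4 import BeautifulSoup
from flask import Flask, render_template

app = Flask(__name__)

@app.route('/')
def index():
    # Fetch the page; the timeout keeps a slow site from hanging the request
    url = 'https://pidancode.com/'
    response = requests.get(url, timeout=10)
    # Parse the HTML and extract the title and all h2 tags
    soup = BeautifulSoup(response.content, 'html.parser')
    title = soup.title.string if soup.title else ''
    h2_list = soup.find_all('h2')
    # Pass the extracted data to the index.html template
    return render_template('index.html', title=title, h2_list=h2_list)

if __name__ == '__main__':
    app.run()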

The index.html template file (by default Flask looks for it in a templates folder next to the application):

<!DOCTYPE html>
<html>
<head>
    <meta charset="UTF-8">
    <title>{{ title }}</title>
</head>
<body>
    <h1>{{ title }}</h1>
    <ul>
    {% for h2 in h2_list %}
        <li>{{ h2.get_text() }}</li>
    {% endfor %}
    </ul>
</body>
</html>

You can now open the Flask application in a browser (http://127.0.0.1:5000/ by default) and see the page title and the text of every h2 tag.
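
The same flow can also be checked without a browser or a network connection. The sketch below is one possible arrangement, not the article's own code: extract_page_data, create_app, html_source, SAMPLE_HTML, and PAGE_TEMPLATE are hypothetical names, and an inline template rendered with render_template_string replaces index.html so the example is self-contained.

```python
from bs4 import BeautifulSoup
from flask import Flask, render_template_string

# Hypothetical page content and inline template, invented for this sketch.
SAMPLE_HTML = (
    '<html><head><title>Demo</title></head>'
    '<body><h2>First</h2><h2>Second</h2></body></html>'
)
PAGE_TEMPLATE = (
    '<h1>{{ title }}</h1>'
    '<ul>{% for text in headings %}<li>{{ text }}</li>{% endfor %}</ul>'
)


def extract_page_data(html):
    """Return the title and h2 texts as plain strings, ready for a template."""
    soup = BeautifulSoup(html, 'html.parser')
    title = soup.title.get_text(strip=True) if soup.title else ''
    headings = [h2.get_text(strip=True) for h2 in soup.find_all('h2')]
    return {'title': title, 'headings': headings}


def create_app(html_source):
    """Build an app whose index view parses whatever html_source() returns."""
    app = Flask(__name__)

    @app.route('/')
    def index():
        data = extract_page_data(html_source())
        return render_template_string(PAGE_TEMPLATE, **data)

    return app


# Exercise the route with Flask's built-in test client instead of a browser.
app = create_app(lambda: SAMPLE_HTML)
response = app.test_client().get('/')
print(response.status_code)
print(response.get_data(as_text=True))
```

In a real application, html_source could be a function that downloads the page with requests; keeping it as a parameter means the parsing and rendering logic can be tested against fixed HTML.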
