Parsing the hyperlinks in a web page with BeautifulSoup in Python

2022-04-27 00:00:00 parsing, web page, hyperlinks

This Python snippet uses BeautifulSoup to parse the hyperlinks in a web page. It prints every link on the www.pidancode.com home page whose URL contains a/.

"""
皮蛋编程 (https://www.pidancode.com)
Created: 2022/4/1
Description: parse the hyperlinks in a web page with BeautifulSoup in Python
"""
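from bs4 import BeautifulSoup
import urllib.request

# Download the home page; the with-block closes the connection afterwards
with urllib.request.urlopen("https://www.pidancode.com") as response:
    content = response.read()

# Parse the HTML with the lxml parser (requires the lxml package)
soup = BeautifulSoup(content, features='lxml')

# Visit every <a> tag that has an href attribute
for a in soup.find_all('a', href=True):
    # Keep only the links whose href contains "a/"
    if 'a/' in a['href']:
        print("Found link:", a['href'])

Output:
Found link: /a/16469945463929798.html
Found link: /a/16469945463795168.html
Found link: /a/16469945463649652.html
Found link: /a/16469945463624250.html
Found link: /a/16469945463595160.html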
...
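
The links printed above are site-relative paths. If absolute URLs are needed, one option is to resolve each path against the address the page was fetched from. The following is a minimal sketch, not part of the original script: the helper name to_absolute is hypothetical, and the base URL is assumed to be the same one passed to urlopen.

```python
from urllib.parse import urljoin

# Assumption: the page was fetched from this address, as in the script above.
BASE_URL = "https://www.pidancode.com"


def to_absolute(base, hrefs):
    """Resolve each (possibly relative) href against the base URL.

    to_absolute is a hypothetical helper name used for illustration.
    """
    return [urljoin(base, href) for href in hrefs]


# Two of the relative paths taken from the output listed above.
relative_links = [
    "/a/16469945463929798.html",
    "/a/16469945463795168.html",
]

for link in to_absolute(BASE_URL, relative_links):
    print("Absolute link:", link)
```

urljoin leaves an href that is already absolute unchanged, so the helper can be applied to a mixed list of relative and absolute links.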

The code was tested under Python 3.9.
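
A plain substring test for a/ also matches unrelated paths such as /media/logo.png. If only links under the /a/ directory are wanted, the filter can be anchored to the start of the href. The sketch below makes several assumptions: the sample markup is invented for illustration, the function name article_links is hypothetical, and the standard-library html.parser backend is used in place of lxml so the example runs without an extra dependency.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for a downloaded page.
SAMPLE_HTML = """
<a href="/a/1.html">article</a>
<a href="/media/logo.png">logo</a>
<a href="/about">about</a>
<a>anchor without href</a>
"""


def article_links(html):
    """Return the hrefs that start with /a/ (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    # A compiled regex passed as an attribute filter is matched against
    # each tag's href; tags without an href are skipped.
    starts_with_a_dir = re.compile(r"^/a/")
    return [a["href"] for a in soup.find_all("a", href=starts_with_a_dir)]


for href in article_links(SAMPLE_HTML):
    print("Found link:", href)
```

Passing the pattern directly to find_all keeps the filtering inside BeautifulSoup, so the loop body no longer needs a separate if test.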
