PyQt4 to PyQt5: mainFrame() is deprecated, need a fix to load web pages
Problem description
I'm following Sentdex's PyQt4 YouTube tutorial, but using PyQt5 instead of PyQt4. It's a simple web scraping app. I worked through the tutorial's PyQt4 version, and now I'm trying to write the same application with PyQt5. This is what I have:
import os
import sys
from PyQt5.QtWidgets import QApplication
from PyQt5.QtCore import QUrl, QEventLoop
from PyQt5.QtWebEngineWidgets import QWebEnginePage
from bs4 import BeautifulSoup
import requests


class Client(QWebEnginePage):
    def __init__(self, url):
        self.app = QApplication(sys.argv)
        QWebEnginePage.__init__(self)
        self.loadFinished.connect(self._loadFinished)
        self.load(QUrl(url))
        self.app.exec_()

    def _loadFinished(self):
        self.app.quit()


url = 'https://pythonprogramming.net/parsememcparseface/'
client_response = Client(url)

#I think the issue is here at LINE 26
source = client_response.mainFrame().toHtml()
soup = BeautifulSoup(source, "html.parser")
js_test = soup.find('p', class_='jstest')
print(js_test.text)
When I run this, I get the message:
    source = client_response.mainFrame().toHtml()
AttributeError: 'Client' object has no attribute 'mainFrame'
I've tried a few different solutions but none work. Any help would be appreciated.
EDIT
Logging QUrl(url) on line 15 returns this value:
PyQt5.QtCore.QUrl('https://pythonprogramming.net/parsememcparseface/')
When I try source = client_response.load(QUrl(url)) for line 26, I end up with the message:
  File "test3.py", line 28, in <module>
    soup = BeautifulSoup(source, "html.parser")
  File "/Users/MYNAME/.venv/qtproject/lib/python3.6/site-packages/bs4/__init__.py", line 192, in __init__
    elif len(markup) <= 256 and (
TypeError: object of type 'NoneType' has no len()
When I try source = client_response.url() I get:
    soup = BeautifulSoup(source, "html.parser")
  File "/Users/MYNAME/.venv/qtproject/lib/python3.6/site-packages/bs4/__init__.py", line 192, in __init__
    elif len(markup) <= 256 and (
TypeError: object of type 'QUrl' has no len()
Solution
You must call QWebEnginePage::toHtml() inside the definition of the class. QWebEnginePage::toHtml() is asynchronous: it takes a callback (a function or a lambda) as its parameter, and that callback must in turn take one parameter of type str, which receives the page's HTML. Sample code is below.
import sys

import bs4 as bs
from PyQt5.QtCore import QUrl
from PyQt5.QtWebEngineWidgets import QWebEnginePage
from PyQt5.QtWidgets import QApplication


class Page(QWebEnginePage):
    def __init__(self, url):
        # A QApplication must exist before the page is constructed.
        self.app = QApplication(sys.argv)
        QWebEnginePage.__init__(self)
        self.html = ''
        self.loadFinished.connect(self._on_load_finished)
        self.load(QUrl(url))
        # Blocks until Callable() calls quit().
        self.app.exec_()

    def _on_load_finished(self):
        # toHtml() returns nothing; the HTML is delivered to the callback.
        self.toHtml(self.Callable)
        print('Load finished')

    def Callable(self, html_str):
        self.html = html_str
        self.app.quit()


def main():
    page = Page('https://pythonprogramming.net/parsememcparseface/')
    soup = bs.BeautifulSoup(page.html, 'html.parser')
    js_test = soup.find('p', class_='jstest')
    print(js_test.text)


if __name__ == '__main__':
    main()
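
If it helps to see why assigning the result of load() or toHtml() gives you None, here is a minimal pure-Python sketch of the same callback pattern. FakePage and process_events() are hypothetical stand-ins I made up for illustration; they are not part of PyQt5, and the real event loop does far more than this. The only point is the ordering: the call returns immediately with nothing, and the HTML only shows up once the loop runs the callback.

```python
from collections import deque


class FakePage:
    """Hypothetical stand-in for a page with a callback-style toHtml().

    Illustration only; this is not a PyQt5 class.
    """

    def __init__(self, html):
        self._html = html
        self._pending = deque()  # callbacks waiting for the "event loop"

    def toHtml(self, callback):
        # Schedule the callback and return immediately with no value.
        self._pending.append(callback)

    def process_events(self):
        # Stand-in for the event loop: hand the HTML to each queued callback.
        while self._pending:
            self._pending.popleft()(self._html)


received = []
page = FakePage("<p>hello</p>")

result = page.toHtml(received.append)
print(result)    # None: the call itself returns nothing
print(received)  # []: the callback has not run yet

page.process_events()
print(received)  # ['<p>hello</p>']: the HTML arrives via the callback
```

In the PyQt5 code above, app.exec_() plays the role of process_events(), which is why the HTML has to be stored from inside the callback and app.quit() is only called there.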