Selenium/Python - Extracting dynamically generated HTML after submitting a form

2022-01-15 00:00:00 python selenium selenium-chromedriver

Question

The web page I am trying to access uses JavaScript to dynamically generate an HTML form (this one: https://imgur.com/a/rhmXB). When I run print(driver.page_source), the form seems to appear in the HTML output.

However, after I fill in the input field and submit the form, another input field with a CAPTCHA image appears (as shown here: https://imgur.com/a/xVfBS). When I run print(driver.page_source) again, the input form with the CAPTCHA does not seem to be in the HTML.

My question is: how can I use Selenium to access this dynamically generated HTML, which contains the input field and the CAPTCHA image?

Here is my code (also posted on pastebin):

from selenium import webdriver
driver = webdriver.Chrome("/var/chromedriver/chromedriver")

URL = 'http://nap.bg/link?id=104'
driver.get(URL)

input_field = driver.find_element_by_name('ipID')
input_field.send_keys('0000000000')
driver.find_element_by_id('idSubmit').click()
print(driver.page_source)


Solution

After you click the button, the page takes some time to load the CAPTCHA and other content, so page_source is read before the new form has been inserted. You need to wait for that content to finish loading, which you can do with Selenium's explicit waits.

Here is an example of what you can do:

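from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.Chrome()
URL = 'http://nap.bg/link?id=104'
driver.get(URL)

# Fill in the input field and submit the form.
# find_element(By.<strategy>, value) is used here because the older
# find_element_by_* helpers were removed in newer Selenium 4 releases.
input_field = driver.find_element(By.NAME, 'ipID')
input_field.send_keys('0000000000')
driver.find_element(By.ID, 'idSubmit').click()

# Wait up to 10 seconds for the CAPTCHA response field to become clickable.
wait = WebDriverWait(driver, 10)
wait.until(EC.element_to_be_clickable((By.NAME, 'ipResponse')))

# By now the dynamically inserted CAPTCHA form should be part of the page source.
print(driver.page_source)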
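
To show what the explicit wait in the example is doing conceptually, here is a minimal sketch of a polling loop written with the standard library only. It is not Selenium's implementation: wait_until, WaitTimeout, and FakePage are hypothetical names made up for this illustration, and FakePage merely stands in for a page whose CAPTCHA field shows up after a few checks. The real WebDriverWait follows the same basic idea (call a condition repeatedly until it returns something truthy or the timeout expires) and, by default, also ignores "element not found" errors while polling.

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition never becomes truthy within the timeout."""


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises WaitTimeout once `timeout` seconds pass.
    (Hypothetical helper; a simplified stand-in for an explicit wait.)
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise WaitTimeout(f"condition not met within {timeout} s")
        time.sleep(poll_interval)


class FakePage:
    """Stand-in for a page whose CAPTCHA field appears only after a few polls."""

    def __init__(self, ready_after):
        self.calls = 0
        self.ready_after = ready_after

    def find_captcha_field(self):
        # Returns None until the "page" has been checked enough times.
        self.calls += 1
        return "ipResponse" if self.calls >= self.ready_after else None


page = FakePage(ready_after=3)
field = wait_until(page.find_captcha_field, timeout=2.0, poll_interval=0.01)
print(field, page.calls)
```

Reading page_source immediately after click() corresponds to calling the condition once and giving up; the wait simply keeps checking until the content is there or the time limit is reached.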
