如何旋转 Selenium 网络浏览器 IP 地址

问题描述

我有一个 Python 脚本,它每 30 秒访问一次网站,每次我都需要不同的 IP 地址.

I have a Python script that visits a website every 30 sec, and I would need to have a different IP address each time.

什么是最好/最省时的解决方案?

What would be the best/most time effective solution?

  • 在线抓取免费代理?你知道从多个来源收集代理的 python 脚本吗?

  • scraping free proxies online? Do you know a python script that gather proxies from many sources?

每次使用 Tor 浏览器都有不同的 IP(我在 aws ec2 实例上使用 selenium,你们知道如何在 Ubuntu 服务器上使用 Tor 浏览器的教程吗?)

use Tor browser to have a different IP each time (I'm using selenium on an aws ec2 instance, you guys know a tutorial on how to use Tor browser on Ubuntu server?)

其他方法?


解决方案

要收集和使用不同的代理,一个强大的解决方案是使用在 免费代理列表 使用以下解决方案:

To gather and use different proxies a robust solution would be to make proxied requests to the website using the newly active proxies which gets listed within the Free Proxy List using the following solution:

  • 代码块:

  • Code Block:

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException

options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
driver = webdriver.Chrome(chrome_options=options, executable_path=r'C:WebDriverschromedriver.exe')
driver.get("https://sslproxies.org/")
driver.execute_script("return arguments[0].scrollIntoView(true);", WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//table[@class='table table-striped table-bordered dataTable']//th[contains(., 'IP Address')]"))))
ips = [my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 5).until(EC.visibility_of_all_elements_located((By.XPATH, "//table[@class='table table-striped table-bordered dataTable']//tbody//tr[@role='row']/td[position() = 1]")))]
ports = [my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 5).until(EC.visibility_of_all_elements_located((By.XPATH, "//table[@class='table table-striped table-bordered dataTable']//tbody//tr[@role='row']/td[position() = 2]")))]
driver.quit()
proxies = []
for i in range(0, len(ips)):
    proxies.append(ips[i]+':'+ports[i])
print(proxies)
for i in range(0, len(proxies)):
    try:
        print("Proxy selected: {}".format(proxies[i]))
        options = webdriver.ChromeOptions()
        options.add_argument('--proxy-server={}'.format(proxies[i]))
        driver = webdriver.Chrome(options=options, executable_path=r'C:WebDriverschromedriver.exe')
        driver.get("https://www.whatismyip.com/proxy-check/?iref=home")
        if "Proxy Type" in WebDriverWait(driver, 5).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "p.card-text"))):
            break
    except Exception:
        driver.quit()
print("Proxy Invoked")

  • 控制台输出:

  • Console Output:

    ['190.7.158.58:39871', '175.139.179.65:54980', '186.225.45.146:45672', '185.41.99.100:41258', '43.230.157.153:52986', '182.23.32.66:30898', '36.37.160.253:31450', '93.170.15.214:56305', '36.67.223.67:43628', '78.26.172.44:52490', '36.83.135.183:3128', '34.74.180.144:3128', '206.189.122.177:3128', '103.194.192.42:55546', '70.102.86.204:8080', '117.254.216.97:23500', '171.100.221.137:8080', '125.166.176.153:8080', '185.146.112.24:8080', '35.237.104.97:3128']
    
    Proxy selected: 190.7.158.58:39871
    
    Proxy selected: 175.139.179.65:54980
    
    Proxy selected: 186.225.45.146:45672
    
    Proxy selected: 185.41.99.100:41258
    

  • 相关文章