Setting the proxy using Selenium and Docker
Question
I'm having trouble using a proxy for scraping. I run dockerized Python code against the selenium/standalone-chrome image.
I tried something like this
    def get_chrome_driver(proxy):
        proxy = str(proxy)
        chrome_options = webdriver.ChromeOptions()
        chrome_options.add_argument('--proxy=%s' % proxy)
        chrome_options.add_argument("--no-sandbox")
        chrome_options.add_argument("--headless")
        chrome_options.add_argument("--disable-gpu")
        driver = webdriver.Remote(
            command_executor='http://chrome:4444/wd/hub',
            options=webdriver.ChromeOptions()
        )
        return driver
to pass the parameters, but the Chrome instance seems to ignore them. I have an example scraper that reads the IP address from the ident.me webpage, and it returns my machine's IP.
Answer 1
Score: 1
This line passes a fresh set of default options to the driver instance, so the arguments you added are discarded:
    options=webdriver.ChromeOptions()
Pass the options object you configured instead:
    options=chrome_options
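Putting that fix into the whole function, a minimal sketch might look like the following. It assumes a Selenium version whose webdriver.Remote accepts the options keyword, as the question's code already does. The helper name build_chrome_arguments and the executor_url parameter are illustrative and not part of the original post. The sketch also replaces --proxy with --proxy-server, which is the switch Chrome recognizes for this purpose; the answer above does not mention that, so treat it as an additional suggestion rather than part of the accepted fix.

```python
def build_chrome_arguments(proxy):
    """Return the Chrome command-line switches for a proxied headless session.

    Hypothetical helper: it only builds strings, so it can be checked
    without a running Selenium server.
    """
    return [
        "--proxy-server=%s" % proxy,  # Chrome's switch is --proxy-server, not --proxy
        "--no-sandbox",
        "--headless",
        "--disable-gpu",
    ]


def get_chrome_driver(proxy, executor_url="http://chrome:4444/wd/hub"):
    # Imported here so build_chrome_arguments stays usable without Selenium installed.
    from selenium import webdriver

    chrome_options = webdriver.ChromeOptions()
    for argument in build_chrome_arguments(str(proxy)):
        chrome_options.add_argument(argument)

    # Pass the configured options object, not a fresh webdriver.ChromeOptions().
    return webdriver.Remote(command_executor=executor_url, options=chrome_options)
```

To check the result, load ident.me with the returned driver and compare the reported address with the proxy's. Note that the proxy address has to be reachable from inside the Chrome container, since that is where the browser actually runs.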