Error in Databricks Selenium "WebDriverException: Message: unknown error: no chrome binary at C:\Program Files\Google\Chrome\Application Stacktrace:"


Question


I am trying to scrape a web page with Selenium and Chrome in Azure Databricks. My code is below.

  %pip install selenium
  %pip install webdriver_manager
  from selenium import webdriver
  from selenium.webdriver import Chrome
  from selenium.webdriver.chrome.service import Service
  from selenium.webdriver.common.by import By
  from webdriver_manager.chrome import ChromeDriverManager
  from selenium.webdriver.support.wait import WebDriverWait
  from selenium.webdriver.support import expected_conditions as ExpectedConditions
  from selenium.webdriver.chrome.options import Options
  # Specify the path to the uploaded chromedriver file
  chrome_driver_path = '/dbfs/FileStore/Chromedriver/chromedriver'
  chrome_service = Service(chrome_driver_path)
  # Configure Chrome options
  options = Options()
  options.binary_location = "C:\Program Files\Google\Chrome\Application"
  options.add_argument('--headless') # Run Chrome in headless mode (without GUI)
  options.add_argument("--no-sandbox")
  options.add_argument("--disable-dev-shm-usage")
  options.add_argument("--disable-gpu")
  # Create a new Chrome webdriver instance
  driver = webdriver.Chrome(service=chrome_service, options=options)
  # Example usage: Open a website and print the page title
  url = "https://data.cms.gov/tools/mapping-medicare-disparities-by-population"
  driver.get(url)
  # Clean up and quit the webdriver
  driver.quit()

However, I get the following error:
WebDriverException: Message: unknown error: no chrome binary at C:\Program Files\Google\Chrome\Application
Stacktrace:

Answer 1

Score: 1


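Try the code below.

  from selenium import webdriver
  from selenium.webdriver.chrome.options import Options
  from selenium.webdriver.chrome.service import Service
  # Headless Chrome options; binary_location is deliberately left unset
  options = Options()
  options.add_argument('--headless')
  options.add_argument("--no-sandbox")
  options.add_argument("--disable-dev-shm-usage")
  options.add_argument("--disable-gpu")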


Use a shell command to download ChromeDriver version 113 to
/tmp/chromedriver_linux64.zip.

  %sh
  wget https://chromedriver.storage.googleapis.com/113.0.5672.63/chromedriver_linux64.zip -O /tmp/chromedriver_linux64.zip

Unzip the file.


  %sh
  unzip /tmp/chromedriver_linux64.zip -d /tmp/chromedriver113/

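Install Chrome. Note that google-chrome-stable is whatever the current stable release is, and its major version must match the ChromeDriver version (113 when this answer was written).

  %sh
  # Add Google's signing key and apt repository, then install Chrome
  curl -sS -o - https://dl-ssl.google.com/linux/linux_signing_key.pub | sudo apt-key add -
  echo "deb https://dl.google.com/linux/chrome/deb/ stable main" | sudo tee -a /etc/apt/sources.list.d/google-chrome.list
  sudo apt-get -y update
  sudo apt-get -y install google-chrome-stable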
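
Because a ChromeDriver build only drives the Chrome major version it was built for, it can help to compare the two after installing. The following is a minimal sketch, not part of the original answer: the helper names `major_version` and `installed_versions_match` are hypothetical, and the command name `google-chrome` is an assumption about what the package puts on PATH. The driver path is the one used in this answer.

```python
import re
import subprocess


def major_version(version_output):
    """Extract the major version from text such as 'Google Chrome 113.0.5672.63'."""
    match = re.search(r"(\d+)\.\d+\.\d+\.\d+", version_output)
    if match is None:
        raise ValueError(f"no version number in {version_output!r}")
    return int(match.group(1))


def installed_versions_match(chrome_cmd="google-chrome",
                             driver_cmd="/tmp/chromedriver113/chromedriver"):
    """Run both binaries with --version and compare their major versions.

    chrome_cmd is an assumed command name; change it if Chrome is
    installed under a different name on your cluster.
    """
    chrome_out = subprocess.run([chrome_cmd, "--version"],
                                capture_output=True, text=True, check=True).stdout
    driver_out = subprocess.run([driver_cmd, "--version"],
                                capture_output=True, text=True, check=True).stdout
    return major_version(chrome_out) == major_version(driver_out)
```

Calling `installed_versions_match()` in a notebook cell after the install steps should return True when the two major versions agree. If it returns False, download the ChromeDriver build that corresponds to the installed Chrome version.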


Now get the data from the URL.

  browser = webdriver.Chrome(service=Service('/tmp/chromedriver113/chromedriver'), options=options)
  url = "https://data.cms.gov/tools/mapping-medicare-disparities-by-population"
  browser.get(url)
  browser.title
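
The snippet above never closes the browser, so a Chrome process can stay alive on the driver node after the cell finishes. One way to guard against that, sketched here under assumptions, is to wrap the page load in try/finally. The names `fetch_title` and `StubDriver` are hypothetical; the stub only stands in for a real WebDriver so that the sketch runs without Chrome.

```python
def fetch_title(driver, url):
    """Open url, return the page title, and always quit the browser."""
    try:
        driver.get(url)
        return driver.title
    finally:
        # quit() runs even if get() raised, so no browser process is left behind.
        driver.quit()


class StubDriver:
    """Hypothetical stand-in exposing the three WebDriver members used above."""

    title = "stub page"

    def __init__(self):
        self.visited = None
        self.closed = False

    def get(self, url):
        self.visited = url

    def quit(self):
        self.closed = True


stub = StubDriver()
result = fetch_title(stub, "https://example.com")
print(result, stub.closed)
```

With a real browser you would pass the `webdriver.Chrome(...)` instance created above in place of the stub.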


Follow this solution for more information.

Answer 2

Score: 0


Try the following:

  %pip install selenium
  %pip install webdriver_manager
  from selenium import webdriver
  from selenium.webdriver import Chrome
  from selenium.webdriver.chrome.service import Service
  from selenium.webdriver.common.by import By
  from webdriver_manager.chrome import ChromeDriverManager
  from selenium.webdriver.support.wait import WebDriverWait
  from selenium.webdriver.support import expected_conditions as ExpectedConditions
  from selenium.webdriver import ChromeOptions
  options = ChromeOptions()
  options.add_argument('--headless')
  options.add_argument("--no-sandbox")
  options.add_argument("--disable-dev-shm-usage")
  options.add_argument("--disable-gpu")
  # Create a new Chrome webdriver instance
  driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=options)
  url = "https://data.cms.gov/tools/mapping-medicare-disparities-by-population"
  driver.get(url)
  driver.quit()

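The problem with your code is that options.binary_location points Chrome at a Windows path (C:\Program Files\Google\Chrome\Application), which does not exist on a Databricks cluster because the cluster nodes run Linux. Removing that line lets ChromeDriver look for Chrome in its default install location (Chrome itself must still be installed on the cluster).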
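
As a rough illustration of that point, the sketch below looks up a Chrome executable on the Linux node instead of hard-coding a path. The helper name `find_chrome_binary` and the list of candidate executable names are assumptions for illustration and are not part of the original code; adjust them to whatever is installed on your cluster.

```python
import shutil

# Candidate executable names are an assumption; adjust for your cluster image.
CANDIDATES = ("google-chrome", "google-chrome-stable", "chromium-browser", "chromium")


def find_chrome_binary(candidates=CANDIDATES, which=shutil.which):
    """Return the full path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = which(name)
        if path:
            return path
    return None


chrome_path = find_chrome_binary()
print(chrome_path or "No Chrome binary found on PATH; install Chrome first.")
```

If a path is found, it can be assigned to `options.binary_location`. If nothing is found, setting `binary_location` will not help, and Chrome has to be installed first.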

huangapple
  • Posted on May 29, 2023, 16:01:11
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76355603.html