Windows Python Selenium chrome binary issue? No chrome binary at C:\Program Files\Google\Chrome\Application\chrome.exe
# Question
I'm fairly new to coding, and I'm supposed to parse Yelp reviews so I can analyze the data using Pandas. I have been trying to use Selenium and BeautifulSoup to automate the whole process, but I can't get past the Chrome binary location errors in each version of the code I write. I feel like I've tried everything. Can someone please tell me what I'm doing wrong?
!pip install selenium
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from bs4 import BeautifulSoup
import pandas as pd
import os
# Set the path to the ChromeDriver executable
chromedriver_path = "C:\\Users\\5mxz2\\Downloads\\chromedriver_win32\\chromedriver"
# Set the path to the Chrome binary
chrome_binary_path = "C:\\Program Files\\Google\\Chrome\\Application\\chrome.exe" # Update this with the correct path to your Chrome binary
# Set the URL of the Yelp page you want to scrape
url = "https://www.yelp.com/biz/gelati-celesti-virginia-beach-2"
# Set the options for Chrome
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument("--headless") # Run Chrome in headless mode, comment this line if you want to see the browser window
chrome_options.binary_location = chrome_binary_path
# Create the ChromeDriver service
service = Service(chromedriver_path)
# Create the ChromeDriver instance
driver = webdriver.Chrome(service=service, options=chrome_options)
# Load the Yelp page
driver.get(url)
# Extract the page source and pass it to BeautifulSoup
soup = BeautifulSoup(driver.page_source, "html.parser")
# Find all review elements on the page
reviews = soup.find_all("div", class_="review")
# Create empty lists to store the extracted data
review_texts = []
ratings = []
dates = []
# Iterate over each review element
for review in reviews:
    # Extract the review text
    review_text = review.find("p", class_="comment").get_text()
    review_texts.append(review_text.strip())
    # Extract the rating
    rating = review.find("div", class_="rating").get("aria-label")
    ratings.append(rating)
    # Extract the date
    date = review.find("span", class_="rating-qualifier").get_text()
    dates.append(date.strip())
# Create a DataFrame from the extracted data
data = {
"Review Text": review_texts,
"Rating": ratings,
"Date": dates
}
df = pd.DataFrame(data)
# Print the DataFrame
print(df)
# Get the current working directory
path = os.getcwd()
# Save the DataFrame as a CSV file
csv_path = os.path.join(path, "yelp_reviews.csv")
df.to_csv(csv_path, index=False)
# Close the ChromeDriver instance
driver.quit()
That's what I have so far, but I keep getting this error message:
```
WebDriverException                        Traceback (most recent call last)
<ipython-input-11-6c92e956c704> in <cell line: 27>()
     25
     26 # Create the ChromeDriver instance
---> 27 driver = webdriver.Chrome(service=service, options=chrome_options)
     28
     29 # Load the Yelp page

5 frames
/usr/local/lib/python3.10/dist-packages/selenium/webdriver/remote/errorhandler.py in check_response(self, response)
    243                 alert_text = value["alert"].get("text")
    244             raise exception_class(message, screen, stacktrace, alert_text)  # type: ignore[call-arg]  # mypy is not smart enough here
--> 245         raise exception_class(message, screen, stacktrace)

WebDriverException: Message: unknown error: no chrome binary at C:\Program Files\Google\Chrome\Application\chrome.exe
Stacktrace:
#0 0x55f8912b24e3 <unknown>
#1 0x55f890fe1c76 <unknown>
#2 0x55f8910085e0 <unknown>
#3 0x55f891007029 <unknown>
#4 0x55f891045ccc <unknown>
#5 0x55f89104547f <unknown>
#6 0x55f89103cde3 <unknown>
#7 0x55f8910122dd <unknown>
#8 0x55f89101334e <unknown>
#9 0x55f8912723e4 <unknown>
#10 0x55f8912763d7 <unknown>
#11 0x55f891280b20 <unknown>
#12 0x55f891277023 <unknown>
#13 0x55f8912451aa <unknown>
#14 0x55f89129b6b8 <unknown>
#15 0x55f89129b847 <unknown>
#16 0x55f8912ab243 <unknown>
#17 0x7ff7aa929609 start_thread
```
# Answer 1
**Score**: 1
Check the following things:

- Check if the specified file exists. In Python:

  ```python
  import os
  print(os.path.exists(chrome_binary_path))  # checks if the path exists
  # prints 'True' if the file exists, else it prints 'False'
  ```

  or as a one-liner:

  ```python
  print(__import__("os").path.exists(chrome_binary_path))
  ```

- Check if you have permission to open that file. Press [Win] + [R] and paste the file path there. A message box with an error like "Permission denied" should appear if you have no permission to read the file.
- Check if the hard drive is corrupted: https://www.avast.com/c-chkdsk-windows
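
The existence and permission checks above can also be scripted together. The following is only a minimal sketch: `describe_path` is a hypothetical helper name, not part of Selenium, and it relies on the standard-library `os.access` call to ask whether the current process may read the file.

```python
import os


def describe_path(path):
    """Return a short diagnosis for a file path (hypothetical helper)."""
    if not os.path.exists(path):
        return "missing"
    if not os.path.isfile(path):
        return "not a file"
    if not os.access(path, os.R_OK):
        return "not readable"
    return "ok"


# Path taken from the question; replace it with your own.
chrome_binary_path = "C:\\Program Files\\Google\\Chrome\\Application\\chrome.exe"
print(describe_path(chrome_binary_path))
```

If this reports "missing", the path does not exist on the machine that is actually running the Python code, which would match the error message in the question.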
# Answer 2
**Score**: 1
This error message...

    WebDriverException: Message: unknown error: no chrome binary at C:\Program Files\Google\Chrome\Application\chrome.exe

...implies that no Chrome binary was found at the location specified through `chrome_binary_path`:

    chrome_binary_path = "C:\\Program Files\\Google\\Chrome\\Application\\chrome.exe" # Update this with the correct path to your Chrome binary

Possibly, Google Chrome was installed at a custom location.

Solution

You need to pass the exact location of the Chrome binary through the `binary_location` attribute, as follows:

    chrome_options.binary_location = "C:\\path\\to\\chrome.exe"

Reference

You can find a couple of relevant detailed discussions in:

- WebDriverException: unknown error: cannot find Chrome binary error with Selenium in Python for older versions of Google Chrome
- Is Chrome installation needed or only chromedriver when using Selenium?
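
One way to avoid hard-coding a path that may not exist is to look for the binary at run time. Note that the traceback in the question shows `/usr/local/lib/python3.10/dist-packages`, which suggests the notebook ran on a Linux host, where a `C:\` path cannot resolve. The sketch below is an illustration under assumptions: `find_chrome_binary` is a hypothetical helper, and the candidate locations and executable names are common defaults, not an official or exhaustive list.

```python
import os
import shutil

# Assumed common install locations and executable names; adjust as needed.
CANDIDATE_PATHS = (
    r"C:\Program Files\Google\Chrome\Application\chrome.exe",
    r"C:\Program Files (x86)\Google\Chrome\Application\chrome.exe",
    "/usr/bin/google-chrome",
    "/usr/bin/chromium-browser",
)
CANDIDATE_NAMES = ("chrome", "google-chrome", "chromium", "chromium-browser")


def find_chrome_binary(paths=CANDIDATE_PATHS, names=CANDIDATE_NAMES):
    """Return the first Chrome-like executable found, or None."""
    # First try the explicit file paths.
    for path in paths:
        if os.path.isfile(path):
            return path
    # Then search the directories listed in PATH.
    for name in names:
        found = shutil.which(name)
        if found:
            return found
    return None


chrome_binary = find_chrome_binary()
print(chrome_binary)
```

If this prints a path, it can be assigned to `chrome_options.binary_location`. If it prints `None`, Chrome is probably not installed in the environment running the code and would need to be installed there first.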