How to get the price from the Amazon website with Selenium

Question


I want to get the price of a product from the Amazon website.

How can I extract the price value?

#---------------------------------------------------------------------------------
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

chrome_driver_path = r"C:\Users\user\Desktop\Development\chromedriver.exe"
# Raw string so the backslashes in the Windows path are not read as escapes
service = Service(r"C:\Program Files\Chrome Driver\chromedriver.exe")

driver = webdriver.Chrome(service=service)

driver.get("https://www.amazon.com/Apple-MacBook-16-Inch-Storage-2-6GHz/dp/B08CZT64VP/ref=sr_1_4?crid=2T5BGI71K1C8W&keywords=macbook&qid=1689176236&sprefix=macboo%2Caps%2C357&sr=8-4")

# Reads the aria-hidden attribute of the first element with class a-offscreen
price = driver.find_element(By.CLASS_NAME,"a-offscreen").get_attribute("aria-hidden")

print(price)
#---------------------------------------------------------------------------------

# Output: None

Answer 1

Score: 0


Several elements on the page share the a-offscreen class, so find_element() may not return the one you need. In addition, you are reading an attribute that the matched element does not have, which is why the result is None.

Therefore, change the line

price = driver.find_element(By.CLASS_NAME,"a-offscreen").get_attribute("aria-hidden")

to

price = driver.find_element(By.CSS_SELECTOR,"span.a-price.a-text-price.a-size-medium span.a-offscreen").get_attribute("innerText")

Output:

$849.69
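
Since the question asks for the price as a number, the string returned above still needs converting. The following is a minimal sketch, assuming a US-style price format (comma as thousands separator, dot as decimal point); `parse_price` is a hypothetical helper name, not part of Selenium.

```python
import re
from decimal import Decimal


def parse_price(text: str) -> Decimal:
    """Convert a displayed price such as "$849.69" into a Decimal.

    Assumes US-style formatting: "," groups thousands, "." marks decimals.
    """
    # Find the first run of digits, allowing thousands separators and decimals
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    if match is None:
        raise ValueError(f"no price found in {text!r}")
    # Drop the thousands separators before handing the digits to Decimal
    return Decimal(match.group().replace(",", ""))


print(parse_price("$849.69"))    # 849.69
print(parse_price("$1,249.00"))  # 1249.00
```

Decimal is used here rather than float so that the cents are kept exactly as displayed.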

Answer 2

Score: 0


Your program prints None because find_element() returns the first span with the a-offscreen class, and that span has no aria-hidden attribute; get_attribute("aria-hidden") returns None when the attribute is not present. To get the price text, use the .text attribute of the element instead, e.g. price = driver.find_element(By.CLASS_NAME, "a-offscreen").text

You may refer to https://pythonexamples.org/python-selenium-find-element-by-css-selector/
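
One caveat with .text is that Selenium reports rendered text only, so it can come back empty for an element that is hidden from view. Assuming, as the class name suggests, that a-offscreen spans are positioned off screen, a fallback to the DOM textContent property may be needed. The sketch below shows that fallback; `element_text` is a hypothetical helper name, and it only relies on the `.text` property and `get_attribute()` method of a WebElement.

```python
def element_text(element) -> str:
    """Return the element's text, falling back to the DOM textContent property.

    `element` is expected to behave like a Selenium WebElement, i.e. to expose
    a `.text` property and a `get_attribute(name)` method.
    """
    text = (element.text or "").strip()
    if text:
        return text
    # .text covers rendered text only; textContent is read from the DOM,
    # so it is still populated when the element is hidden from view.
    return (element.get_attribute("textContent") or "").strip()
```

With a live session it could be called as element_text(driver.find_element(By.CLASS_NAME, "a-offscreen")). Note that this still takes the first matching span, which may not be the price you want if several elements share the class.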

Answer 3

Score: 0


To extract the text $845.69, ideally wait for the element to become visible using WebDriverWait with visibility_of_element_located(), then read its text. You can use the following locator strategy:

  • Using CSS_SELECTOR and text attribute:

    driver.get("https://www.amazon.com/Apple-MacBook-16-Inch-Storage-2-6GHz/dp/B08CZT64VP/ref=sr_1_4?crid=2T5BGI71K1C8W&keywords=macbook&qid=1689176236&sprefix=macboo%2Caps%2C357&sr=8-4")
    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "div#apex_desktop div#corePrice_desktop table.a-lineitem span.apexPriceToPay[data-a-color='price']"))).text)
    
  • Console output:

    $845.69
    
  • Note: you have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    

>You can find a relevant discussion in How to retrieve the text of a WebElement using Selenium - Python
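
To make the idea behind an explicit wait concrete: the condition is evaluated repeatedly until it returns a truthy value or the timeout runs out. The following is a plain-Python sketch of that polling loop, not Selenium's implementation; `poll_until` and its parameters are hypothetical names chosen for illustration. The 0.5 second default interval mirrors WebDriverWait's default poll frequency.

```python
import time


def poll_until(condition, timeout=20.0, interval=0.5):
    """Call `condition()` until it returns a truthy value, then return that value.

    Raises TimeoutError if no truthy value is produced within `timeout` seconds.
    This is an illustration of the polling idea, not Selenium's WebDriverWait.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        # Pause between attempts so the loop does not spin at full speed
        time.sleep(interval)
```

In the answer's code, WebDriverWait(driver, 20).until(...) plays this role: the expected condition is the callable, and the located element is the truthy value that gets returned. Selenium's own class raises its own TimeoutException rather than the built-in TimeoutError used in this sketch.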

huangapple
  • Posted on July 12, 2023, 23:48:46
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76672395.html