Automate the Boring Stuff - web scraping with Selenium not working

Question

I'm working my way through the book, but I'm stuck on Selenium here:

Link to the page in the book:
https://automatetheboringstuff.com/2e/chapter12/

To skip forward to the part with the code that is not working for me, look for:
"For example, open a new file editor tab and enter the following program: "

Now I realize that this code is not current as of 2023, but what would it have to look like to work? I've spent all afternoon trying 2022/2023 tutorials found online, with no success. Thanks for any guidance you can give.

Per the website https://inventwithpython.com, if I use dev tools and element inspection and hover over "The Recursive Book of Recursion", I see:
<img alt="Cover of The Recursive Book of Recusrion" src="images/cover_recursion_thumb.webp">
The book only looks for "cover-thumb"; I've also tried "card-img-top cover-thumb" and "card-img-top.cover-thumb".

Tried the following solutions from the web (amended to the case I'm looking for):
selenium-python.readthedocs.io

content = driver.find_element(By.CLASS_NAME, 'content')
elem = driver.find_element(By.CLASS_NAME, 'card-img-top.cover-thumb')

Tried this recent solution proposed on Stack Overflow:
https://stackoverflow.com/questions/71989191/how-do-i-find-element-by-class-name-in-selenium

link_elements = find_elements(By.CLASS_NAME, "BM30N")
elem = find_elements(By.CLASS_NAME, "card-img-top cover-thumb")  # also tried with a period instead of a space

Even ChatGPT let me down (prompt: Tell me the python code to pull the class-name "card-img-top cover-thumb" via selenium from https://inventwithpython.com):

from selenium import webdriver

driver = webdriver.Firefox()
driver.get("https://inventwithpython.com")

element = driver.find_element_by_class_name("card-img-top.cover-thumb")

class_name = element.get_attribute("class")
print(class_name)

driver.quit()

And of course I looked at this too:
https://www.reddit.com/r/inventwithpython/comments/8ykq1i/automate_the_boring_stuff_with_python_corrections/

Answer 1

Score: 0

Did you try it this way?

For Chrome:

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://inventwithpython.com")

# Find the first element that carries both classes
element = driver.find_element(By.CLASS_NAME, "card-img-top.cover-thumb")

class_name = element.get_attribute("class")
print(class_name)

driver.quit()  # close the browser when done

Output:

card-img-top cover-thumb

For Firefox:

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Firefox()
driver.get("https://inventwithpython.com")

# Find the first element that carries both classes
element = driver.find_element(By.CLASS_NAME, "card-img-top.cover-thumb")

class_name = element.get_attribute("class")
print(class_name)

driver.quit()  # close the browser when done

Output:

card-img-top cover-thumb
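
A note on why the space-separated string fails: as far as I know, By.CLASS_NAME is meant for a single class name, and the find_element_by_class_name call from the question no longer exists in current Selenium 4 releases. A more explicit route is to build a compound CSS selector from the class attribute and pass it with By.CSS_SELECTOR. The sketch below is only an illustration of that idea; the helper names css_for_classes and print_cover_classes are made up for this example and are not part of Selenium or the book.

```python
def css_for_classes(class_attr: str) -> str:
    """Turn an HTML class attribute value into a compound CSS class selector.

    Hypothetical helper (not a Selenium API): "a b" becomes ".a.b".
    """
    names = class_attr.split()  # split on any whitespace
    if not names:
        raise ValueError("class attribute is empty")
    return "".join("." + name for name in names)


def print_cover_classes(url: str = "https://inventwithpython.com") -> None:
    """Open the page in Firefox and print the class attribute of the first match.

    Assumes Firefox and its driver are available on this machine.
    """
    # Imported here so css_for_classes can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Firefox()
    try:
        driver.get(url)
        selector = css_for_classes("card-img-top cover-thumb")
        element = driver.find_element(By.CSS_SELECTOR, selector)
        print(element.get_attribute("class"))
    finally:
        driver.quit()  # always close the browser, even if the lookup fails
```

Calling print_cover_classes() should behave like the snippets above, assuming the page still uses those class names. The only difference is that the selector is spelled out as CSS rather than relying on how By.CLASS_NAME handles a dotted string.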
huangapple
  • Published on June 19, 2023 at 02:27:25
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76502016.html