Why do I keep getting 'None' as a response while webscraping in PyCharm?
Question
I'm going through thenewboston's Python tutorial for a web crawler and trying to follow his steps, but I have not been able to get what I want. I want to get all the quotes from this website, https://quotes.toscrape.com/page/1/, but my code keeps printing "None" for every quote.
import requests
from bs4 import BeautifulSoup

def trade_spider(max_pages):
    page = 1
    while page <= max_pages:
        url = 'http://quotes.toscrape.com/page/' + str(page)
        source_code = requests.get(url)
        plain_text = source_code.text
        soup = BeautifulSoup(plain_text, "html.parser")
        for link in soup.findAll('div', {'class': 'quote'}):
            href = link.get('quote')
            print(href)
        page += 1

trade_spider(1)
I tried a lot of things, but I can't really find a YouTube tutorial on it.
Answer 1
Score: 0
Your bug lies with the line
href = link.get('quote')
link is of type Tag. You are calling the get method on it, which, according to the documentation, returns the value of the corresponding attribute. However, when you print your link variable, you can see that it is a div and does not have a quote attribute, so get returns None. Instead, you can access its span subtag to extract the quotes:
span = link.find('span', {'class': 'text'})
quote = span.text
print(quote)
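To see the difference without hitting the network, here is a minimal sketch that runs the same lookups against an inline HTML string. SAMPLE_HTML is a made-up sample that only assumes the div/span structure described above, and extract_quotes is a hypothetical helper name, not something from the tutorial.

```python
from bs4 import BeautifulSoup

# Assumed sample markup: a div with class "quote" wrapping a span with
# class "text", as described in the answer. Not copied from the real site.
SAMPLE_HTML = """
<div class="quote">
  <span class="text">"First quote."</span>
  <small class="author">Author One</small>
</div>
<div class="quote">
  <span class="text">"Second quote."</span>
  <small class="author">Author Two</small>
</div>
"""


def extract_quotes(html):
    """Return the text of every span.text nested inside a div.quote."""
    soup = BeautifulSoup(html, "html.parser")
    quotes = []
    for div in soup.find_all("div", {"class": "quote"}):
        span = div.find("span", {"class": "text"})
        # find() returns None when nothing matches, so guard before .text
        if span is not None:
            quotes.append(span.text)
    return quotes


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
first_div = soup.find("div", {"class": "quote"})

# get() looks up an HTML attribute; the div has no attribute named "quote"
print(first_div.get("quote"))  # None

# The quote text lives in the nested span, not in an attribute of the div
for quote in extract_quotes(SAMPLE_HTML):
    print(quote)
```

In the original trade_spider function, the same fix would mean replacing the link.get('quote') line with the span lookup inside the for loop.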