Trying to run a for loop on an HTML element with bs4, but it does not iterate

Question


I am trying to get the country names, market names, and URLs from the website below using the BeautifulSoup library. I am trying to collect the country names with a for loop, but it only gives me the first one.

I expected it to iterate through all the countries, but it does not.

    from bs4 import BeautifulSoup
    import requests

    html_text = requests.get('https://www.freshplaza.com/europe/content/retailers/').text
    soup = BeautifulSoup(html_text, 'lxml')
    retailer_links = soup.find_all('div', {'id': 'retailers'})

    for el in retailer_links:
        print(el.h2.text)

    for el in retailer_links:
        if el.h2.text == 'Afghanistan':
            print(True)
        else:
            print(False)

    for el in retailer_links:
        if el.h2.text == 'Hong Kong':
            print(True)
        else:
            print(False)

Answer 1

Score: 0


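el.h2 is shorthand for el.find('h2'): it returns only the first <h2> element inside el, not all of the <h2> elements inside the div. An id normally identifies a single element, so find_all('div', {'id': 'retailers'}) most likely returns a list containing one div. Each loop therefore runs once and only ever sees the first heading.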
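
To make the difference concrete, here is a minimal sketch on a made-up HTML snippet. The markup and the country list are assumptions for illustration, not a copy of the real page, and it uses the built-in 'html.parser' so that lxml is not required. It compares attribute-style access (div.h2) with find_all('h2') on the same div.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page (an assumption).
SAMPLE_HTML = """
<div id="retailers">
  <h2>Afghanistan</h2>
  <h2>Hong Kong</h2>
  <h2>Spain</h2>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Only one div carries this id, so the list has a single entry.
divs = soup.find_all("div", {"id": "retailers"})

# div.h2 behaves like div.find("h2"): first matching descendant only.
first_only = [div.h2.text for div in divs]

# find_all("h2") returns every <h2> inside the div.
all_headings = [h2.get_text() for div in divs for h2 in div.find_all("h2")]

print(first_only)
print(all_headings)
```

With this sample, the loop over divs runs once, which matches the "only the first one" symptom described in the question.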

You can use soup.select() to get all the h2 elements inside the retailers div:

    # Select every <h2> nested inside the div with id="retailers"
    retailer_links = soup.select("div#retailers h2")
    for el in retailer_links:
        text = el.get_text()
        if text == 'Afghanistan':
            print('Afghanistan found')
        elif text == 'Hong Kong':
            print('Hong Kong found')
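
The question also asks for market names and URLs. The following is only a sketch of one way to collect them, under a loud assumption: that each country <h2> is followed by a sibling <ul> whose <a> tags are the markets. The sample markup, the example.com URLs, and the name markets_by_country are all hypothetical. Inspect the real page and adjust the selectors before relying on it.

```python
from bs4 import BeautifulSoup

# Hypothetical layout (an assumption): each <h2> is followed by a <ul> of links.
SAMPLE_HTML = """
<div id="retailers">
  <h2>Afghanistan</h2>
  <ul><li><a href="https://example.com/market-a">Market A</a></li></ul>
  <h2>Hong Kong</h2>
  <ul>
    <li><a href="https://example.com/market-b">Market B</a></li>
    <li><a href="https://example.com/market-c">Market C</a></li>
  </ul>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

markets_by_country = {}
for h2 in soup.select("div#retailers h2"):
    country = h2.get_text(strip=True)
    # Next <ul> sibling after the heading; None if there is no such sibling.
    link_list = h2.find_next_sibling("ul")
    links = link_list.find_all("a") if link_list else []
    # Store (market name, URL) pairs for this country.
    markets_by_country[country] = [
        (a.get_text(strip=True), a.get("href")) for a in links
    ]

for country, markets in markets_by_country.items():
    print(country, markets)
```

One caveat of this design: find_next_sibling("ul") skips ahead to the next <ul>, so a country without its own list would pick up the following country's links. If the real page can contain such headings, a stricter check is needed.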

huangapple
  • Posted on February 6, 2023, 05:08:52
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75355518.html