2023-02-06 05:08:52

Trying to run a for loop on an HTML element with bs4, but it does not iterate

Question

I am trying to get the country names, market names, and URLs from the website below using the BeautifulSoup library. I am trying to get the country names with a for loop, but it only gives me the first one.

I expected it to iterate through all the countries, but it does not.

from bs4 import BeautifulSoup
import requests

# Fetch the page and parse it
html_text = requests.get('https://www.freshplaza.com/europe/content/retailers/').text
soup = BeautifulSoup(html_text, 'lxml')

# Find the div that holds the retailer listing
retailer_links = soup.find_all('div', {'id': 'retailers'})

# Print the h2 text of each match
for el in retailer_links:
    print(el.h2.text)

# Check for a heading named 'Afghanistan'
for el in retailer_links:
    if el.h2.text == 'Afghanistan':
        print(True)
    else:
        print(False)

# Check for a heading named 'Hong Kong'
for el in retailer_links:
    if el.h2.text == 'Hong Kong':
        print(True)
    else:
        print(False)

Answer 1

Score: 0

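el.h2 returns only the first <h2> element inside the div (it is shorthand for el.find('h2')), not all of the <h2> elements inside it. Because ids are normally unique, find_all('div', {'id': 'retailers'}) presumably matches a single div, so the loop body runs only once.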
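
To make the difference concrete, here is a minimal sketch that runs offline. The SAMPLE_HTML string is a made-up stand-in for the real page (its actual markup is not shown in the question), and the built-in html.parser is used instead of lxml only so the snippet has no extra dependency.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the real page: one container div, several <h2> headings.
SAMPLE_HTML = """
<div id="retailers">
  <h2>Afghanistan</h2>
  <h2>Albania</h2>
  <h2>Hong Kong</h2>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# find_all() returns a list of matching divs; here there is only one.
divs = soup.find_all("div", {"id": "retailers"})

# Attribute-style access gives the first matching descendant only.
first_heading = divs[0].h2

# find_all() on the div gives every <h2> inside it.
all_headings = divs[0].find_all("h2")

print(len(divs))
print(first_heading.get_text())
print([h.get_text() for h in all_headings])
```

Looping over divs therefore visits one element, and el.h2 inside that loop can only ever see the first heading.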

You can use soup.select() to get all the h2 elements inside the retailers div:

# Select every <h2> nested inside the div with id "retailers"
retailer_links = soup.select("div#retailers h2")
for el in retailer_links:
    # strip=True removes surrounding whitespace before comparing
    text = el.get_text(strip=True)
    if text == 'Afghanistan':
        print('Afghanistan found')
    elif text == 'Hong Kong':
        print('Hong Kong found')
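
The question also asks for market names and URLs. The following is only a sketch under an assumed layout: it supposes that each country <h2> is followed by sibling elements containing <a> links until the next <h2>. The real page may be structured differently, and SAMPLE_HTML, the example.com links, and the collect_retailers helper are all hypothetical.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the real page's structure is an assumption here.
SAMPLE_HTML = """
<div id="retailers">
  <h2>Afghanistan</h2>
  <ul><li><a href="https://example.com/a">Market A</a></li></ul>
  <h2>Hong Kong</h2>
  <ul>
    <li><a href="https://example.com/b">Market B</a></li>
    <li><a href="https://example.com/c">Market C</a></li>
  </ul>
</div>
"""


def collect_retailers(html):
    """Map each country heading to a list of (market name, URL) pairs."""
    soup = BeautifulSoup(html, "html.parser")
    result = {}
    for heading in soup.select("div#retailers h2"):
        country = heading.get_text(strip=True)
        links = []
        # Walk the following sibling tags until the next <h2> starts a new country.
        for sibling in heading.find_next_siblings():
            if sibling.name == "h2":
                break
            for a in sibling.find_all("a", href=True):
                links.append((a.get_text(strip=True), a["href"]))
        result[country] = links
    return result


retailers = collect_retailers(SAMPLE_HTML)
for country, markets in retailers.items():
    print(country, markets)
```

If the links on the real page are nested differently, only the inner sibling walk should need to change; the select("div#retailers h2") step stays the same.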
