find_all not finding all classes
Question
I wrote this code to find all firm links, but it finds only the first two and then stops. Any idea why, and how can I change it?
```python
import requests
from bs4 import BeautifulSoup

url = "https://www.gelbeseiten.de/branchen/rechtsanwalt/mannheim"
req = requests.get(url)
src = req.text
soup = BeautifulSoup(src, "lxml")

all_firmas = soup.find_all("article", class_="mod mod-Treffer")
for i in all_firmas:
    i_2 = i.next_element.next_element
    print(i_2.get("href"))
print("Category done!")
```
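# Answer 1
**Score**: 0
Following your link, only two articles have exactly the class "mod mod-Treffer". The other articles have the class "mod mod-Treffer mod-Treffer--kurz".
The following code also gets the other articles by using a regular expression (requires `import re`).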
```python
all_firmas = soup.find_all("article", class_=re.compile("mod mod-Treffer.*"))
```
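To see why the exact-string search misses most results, here is a minimal sketch run against a made-up HTML fragment. The markup and URLs are assumptions standing in for the real page, and `html.parser` is used only so the example needs no extra parser. It compares the exact class string, a regex with a `.+` suffix, and a regex without it.

```python
import re
from bs4 import BeautifulSoup

# Made-up fragment standing in for the real page (an assumption for illustration).
html = """
<article class="mod mod-Treffer"><a href="/firm-1">Firm 1</a></article>
<article class="mod mod-Treffer"><a href="/firm-2">Firm 2</a></article>
<article class="mod mod-Treffer mod-Treffer--kurz"><a href="/firm-3">Firm 3</a></article>
"""
soup = BeautifulSoup(html, "html.parser")

# A string containing a space is compared with the whole class attribute.
exact = soup.find_all("article", class_="mod mod-Treffer")

# ".+" demands at least one character after "mod-Treffer",
# so only the longer class strings match.
longer_only = soup.find_all("article", class_=re.compile("mod mod-Treffer.+"))

# Without the suffix, the pattern is found in both kinds of class attribute.
both = soup.find_all("article", class_=re.compile("mod mod-Treffer"))

print(len(exact), len(longer_only), len(both))
```

On this fragment the three searches return 2, 1 and 3 articles respectively, so a pattern that requires extra characters after the class string skips the articles whose class attribute is exactly `mod mod-Treffer`.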
# Answer 2
**Score**: 0
Searching for a single class works. All the articles have the class `mod-Treffer`, while `mod` is also applied to other elements, so you can search for `mod-Treffer` alone:
```python
all_firmas = soup.find_all("article", class_="mod-Treffer")
```
To be more specific, you can restrict the search to the `div` with the id `gs_treffer`:
```python
all_firmas = soup.find("div", id="gs_treffer").find_all("article", class_="mod-Treffer")
```
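A minimal sketch of both variants on a made-up fragment may help. The markup, the `mod-Header` class and the URLs are assumptions for illustration; only the `gs_treffer` id and the `mod` / `mod-Treffer` class names come from the answer. It also guards against `find` returning `None`, which would otherwise raise an `AttributeError` when the container is missing.

```python
from bs4 import BeautifulSoup

# Made-up fragment (an assumption); one article sits outside the container.
html = """
<div class="mod mod-Header">not a result</div>
<div id="gs_treffer">
  <article class="mod mod-Treffer"><a href="/firm-1">Firm 1</a></article>
  <article class="mod mod-Treffer mod-Treffer--kurz"><a href="/firm-2">Firm 2</a></article>
</div>
<article class="mod mod-Treffer">outside the container</article>
"""
soup = BeautifulSoup(html, "html.parser")

# A single class name matches any element that carries it among its classes.
everywhere = soup.find_all("article", class_="mod-Treffer")

# find() returns None when nothing matches, so check before chaining.
container = soup.find("div", id="gs_treffer")
scoped = container.find_all("article", class_="mod-Treffer") if container is not None else []

# Collect the first link of each article, skipping articles without one.
links = []
for article in scoped:
    anchor = article.find("a", href=True)
    if anchor is not None:
        links.append(anchor["href"])

print(len(everywhere), links)
```

Here the unscoped search finds all three articles, while the scoped one returns only the two inside the container.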
# Answer 3
**Score**: 0
You can also use [`select`](https://www.crummy.com/software/BeautifulSoup/bs4/doc/#searching-by-css-class) with a CSS selector. It is similar to `find_all`.
```python
all_firmas = soup.select("article.mod.mod-Treffer")
for i in all_firmas:
    print(i.a["href"])
```
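For comparison, here is a small sketch of `select` on a made-up fragment (the markup and URLs are assumptions). In a CSS selector each `.name` is a separate class test, so `article.mod.mod-Treffer` matches any article carrying both classes, regardless of extra ones. The selector can also descend to the link directly.

```python
from bs4 import BeautifulSoup

# Made-up fragment (an assumption); the last article has no link.
html = """
<article class="mod mod-Treffer"><a href="/firm-1">Firm 1</a></article>
<article class="mod mod-Treffer mod-Treffer--kurz"><a href="/firm-2">Firm 2</a></article>
<article class="mod mod-Treffer"><span>no link here</span></article>
"""
soup = BeautifulSoup(html, "html.parser")

# Each ".name" is an independent class test, so extra classes do not matter.
articles = soup.select("article.mod.mod-Treffer")

# Select the anchors themselves; articles without a link are simply skipped,
# whereas i.a["href"] would fail on them because i.a is None.
hrefs = [a["href"] for a in soup.select("article.mod-Treffer a[href]")]

print(len(articles), hrefs)
```

Selecting `a[href]` inside the article is one possible design choice: it avoids a separate `None` check per article, at the cost of returning every link in an article rather than only the first.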