Removing tags from text with BeautifulSoup
# Question

I have this code to extract a song name from NightBot's channel page:
```python
import urllib.request
from bs4 import BeautifulSoup
from selenium import webdriver
driver = webdriver.Firefox(executable_path=r'C:\Users\gabri\AppData\Local\Programs\Python\Python38-32\geckodriver.exe')
driver.get('https://nightbot.tv/t/tonyxzero/song_requests')
html = driver.page_source
soup = BeautifulSoup(html, 'html.parser')
list_item = soup.select("h4 > strong.ng-binding")
print(list_item)
name = list_item.text.strip()  # fails: list_item is a ResultSet, not a single tag
print(name)
```
But when I run it, it prints something like this:

`[<strong class="ng-binding">Jamiroquai - Virtual Insanity (Official Video)<!-- ngIf: currentSong.track.artist --><span class="ng-binding ng-scope" ng-if="currentSong.track.artist" style=""> — JamiroquaiVEVO</span><!-- end ngIf: currentSong.track.artist --></strong>]`

And then this error:

`AttributeError: ResultSet object has no attribute 'text'. You're probably treating a list of elements like a single element. Did you call find_all() when you meant to call find()?`

Is there another way to show just the text without the tags?
# Answer 1
**Score**: 1
`find()` returns a single element rather than a list, so `.text` works on its result:

```python
from bs4 import BeautifulSoup
from selenium import webdriver

driver = webdriver.Firefox(executable_path=r'C:\Users\gabri\AppData\Local\Programs\Python\Python38-32\geckodriver.exe')
driver.get('https://nightbot.tv/t/tonyxzero/song_requests')
html = driver.page_source

# The 'lxml' parser requires the lxml package to be installed.
soup = BeautifulSoup(html, 'lxml')

# find() returns the first matching tag, so .text is available directly.
name = soup.find('strong', {'class': 'ng-binding'}).text
print(name)
```
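
The difference between `select()` and `find()` can be checked without a browser. The following is a minimal sketch, assuming the markup printed in the question is representative of the page; the `HTML` constant is a static copy of that output (with the dash written as `\u2014`), and the built-in `html.parser` is used so that no extra parser package is needed.

```python
from bs4 import BeautifulSoup

# Static copy of the markup shown in the question (an assumption about the
# live page, which is rendered by JavaScript and may differ).
HTML = (
    '<h4><strong class="ng-binding">Jamiroquai - Virtual Insanity (Official Video)'
    '<!-- ngIf: currentSong.track.artist -->'
    '<span class="ng-binding ng-scope" ng-if="currentSong.track.artist" style="">'
    ' \u2014 JamiroquaiVEVO</span>'
    '<!-- end ngIf: currentSong.track.artist --></strong></h4>'
)

soup = BeautifulSoup(HTML, 'html.parser')

# select() always returns a list-like ResultSet, even for a single match.
matches = soup.select('h4 > strong.ng-binding')

# select_one() and find() return a single tag, or None when nothing matches.
first = soup.select_one('h4 > strong.ng-binding')
found = soup.find('strong', {'class': 'ng-binding'})

# Guard against None before reading .text, since the page may not have loaded.
name = found.text.strip() if found is not None else None
print(name)
```

If the CSS selector from the question is preferred, `select_one()` keeps it and still returns a single tag.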
# Answer 2
**Score**: 1
`soup.select()` returns a list of elements, not a single element. To get the text of each element, you need to iterate over the list:

    list_item = soup.select("h4 > strong.ng-binding")
    print(list_item)
    for item in list_item:
        name = item.text.strip()
        print(name)
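
As a runnable illustration of the loop, here is a sketch under stated assumptions: the first entry in `HTML` copies the markup from the question, the second entry is hypothetical and exists only to show several matches, and `own_text` is a helper name introduced here, not part of BeautifulSoup. It also shows one possible way to keep only the song title when the nested `<span>` text is not wanted.

```python
from bs4 import BeautifulSoup, Comment, NavigableString

HTML = (
    # Markup copied from the question's output.
    '<h4><strong class="ng-binding">Jamiroquai - Virtual Insanity (Official Video)'
    '<!-- ngIf: currentSong.track.artist -->'
    '<span class="ng-binding ng-scope"> \u2014 JamiroquaiVEVO</span>'
    '<!-- end ngIf: currentSong.track.artist --></strong></h4>'
    # Hypothetical second entry, added only to demonstrate iteration.
    '<h4><strong class="ng-binding">Example Song<span> by Example Artist</span></strong></h4>'
)


def own_text(tag):
    """Join only the text nodes that are direct children of `tag`.

    Comments are skipped explicitly because Comment is a NavigableString subclass.
    """
    parts = [
        child for child in tag.children
        if isinstance(child, NavigableString) and not isinstance(child, Comment)
    ]
    return ''.join(parts).strip()


soup = BeautifulSoup(HTML, 'html.parser')
items = soup.select('h4 > strong.ng-binding')

# .text concatenates all nested text, including the <span> content.
full_texts = [item.text.strip() for item in items]

# own_text() ignores nested tags, leaving just the title.
titles = [own_text(item) for item in items]

for title in titles:
    print(title)
```

Whether the artist suffix should be kept depends on what the caller needs; `.text` is the simpler choice when the whole label is acceptable.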