How to extract multiple text elements from an HTML class using Beautiful Soup
Question
html_code.find_all('a')[1].text, html_code.find_all('a')[2].text, html_code.find_all('a')[3].text, html_code.find_all('a')[4].text
This is the sample HTML code (from imdb.com) I want to extract text elements from:
<p class="">
Director:
<a href="/name/nm0001104/">Frank Darabont</a>
<span class="ghost">|</span>
Stars:
<a href="/name/nm0000209/">Tim Robbins</a>,
<a href="/name/nm0000151/">Morgan Freeman</a>,
<a href="/name/nm0348409/">Bob Gunton</a>,
<a href="/name/nm0006669/">William Sadler</a>
</p>
From it, I can extract the director, but can't seem to do that for the stars.
I am extracting the director with this:
<html_code>.find('a').text
How can I extract the names of the actors (Tim Robbins, Morgan Freeman, Bob Gunton, William Sadler) using similar syntax?
I'm a beginner with BeautifulSoup, so any help is appreciated. Thank you!
Answer 1
Score: 1
Assuming the HTML structure is consistent (the director link always comes first), you can use find_all instead:
# find_all returns every <a> tag in document order: the director first, then the stars
director, *cast = <html_code>.find_all('a')
print("Director:", director.text)
print("Cast:")
for actor in cast:
    print(actor.text)
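For anyone who wants to run this end to end, here is a minimal self-contained sketch built on the sample HTML from the question. The names html, soup, director_name, cast_names and stars_after_separator are made up for illustration, and the built-in "html.parser" backend is an assumption (any installed parser should behave the same here). The second variant, which anchors on the span with class "ghost", is an alternative not taken from the answer above; it assumes that separator always sits between the director and the stars.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question.
html = """
<p class="">
Director:
<a href="/name/nm0001104/">Frank Darabont</a>
<span class="ghost">|</span>
Stars:
<a href="/name/nm0000209/">Tim Robbins</a>,
<a href="/name/nm0000151/">Morgan Freeman</a>,
<a href="/name/nm0348409/">Bob Gunton</a>,
<a href="/name/nm0006669/">William Sadler</a>
</p>
"""

soup = BeautifulSoup(html, "html.parser")

# Variant 1: take every <a> in document order and unpack the first as the director.
director, *cast = soup.find("p").find_all("a")
director_name = director.get_text(strip=True)
cast_names = [a.get_text(strip=True) for a in cast]

# Variant 2 (assumption: the "ghost" span always separates director from stars):
# collect only the <a> siblings that come after the separator.
separator = soup.find("span", class_="ghost")
stars_after_separator = [
    a.get_text(strip=True) for a in separator.find_next_siblings("a")
]

print("Director:", director_name)
print("Cast:", ", ".join(cast_names))
```

The unpacking variant relies purely on link order, so it would misassign names if a block listed several directors. The separator variant is a little more robust in that case, but it fails with an AttributeError if the span is missing, so check for None first on real pages.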