Web scraping SEC filings
Question
I am scraping 10-Q filings from SEC EDGAR.
This is the URL: https://www.sec.gov/Archives/edgar/data/1652044/000165204419000032/goog10-qq32019.htm
I need to extract the address "1600 Amphitheatre Parkway" without using the id attribute. The snippet below extracts the text by id; however, I need to use the name attribute instead.
from requests_html import HTMLSession
from bs4 import BeautifulSoup
session = HTMLSession()
page = session.get('https://www.sec.gov/Archives/edgar/data/1652044/000165204419000032/goog10-qq32019.htm')
soup = BeautifulSoup(page.content, 'html.parser')
content = soup.find(id="d92517213e644-wk-Fact-0B11263160365DBABCF89969352EE602")
print(content.text)
Instead of the id attribute, I would like to use the name attribute, but I am not able to extract the information using name. Please help.
How can I use the name attribute instead of the id attribute to extract the contents?
Thanks
Answer 1
Score: 1
You can find elements based on attribute values like this:
soup.find('html_tag', {"attribute": "value"})
In your case, the name attribute is on the ix:nonnumeric tag (html.parser lowercases tag names, so ix:nonNumeric in the page source is matched as ix:nonnumeric):
content = soup.find('ix:nonnumeric', {"name": "dei:EntityAddressAddressLine1"})
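A likely reason a call such as soup.find(name="...") returns nothing is that the first parameter of find() is itself called name and matches the tag name, not the HTML name attribute. The sketch below shows one way around that by passing the attribute through attrs. It runs against a small inline HTML sample rather than the live filing; the sample markup, its id values, and the helper find_by_name are hypothetical and only stand in for the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical, trimmed-down stand-in for the inline XBRL markup in the filing;
# the id values are made up for illustration.
SAMPLE_HTML = """
<div>
  <ix:nonNumeric name="dei:EntityAddressAddressLine1" id="fact-1">1600 Amphitheatre Parkway</ix:nonNumeric>
  <ix:nonNumeric name="dei:EntityAddressCityOrTown" id="fact-2">Mountain View</ix:nonNumeric>
</div>
"""


def find_by_name(soup, name_value):
    """Return the text of the first element whose name attribute equals name_value."""
    # The first positional parameter of find() is called `name` and matches
    # the tag name, so the HTML attribute has to be passed through attrs.
    element = soup.find(attrs={"name": name_value})
    return element.get_text(strip=True) if element is not None else None


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
address = find_by_name(soup, "dei:EntityAddressAddressLine1")
print(address)
```

Passing the tag name as well, as in soup.find('ix:nonnumeric', {"name": ...}), narrows the match to that element type. Using attrs alone matches any tag that carries the attribute, which may be handy if the tag name differs between filings.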