Missing text in output when scraping with Beautiful Soup - how do I extract it?
Question
I am currently doing a personal project, and am quite new to web scraping and the Beautiful Soup library, so any help would be much appreciated!
I am currently trying to extract the R1, R2, etc. text from the following HTML snippet.
The code I've written for this is below:
import requests
from bs4 import BeautifulSoup
URL1 = "https://www.sportsbet.com.au/racing-schedule/horse/today"
racing = requests.get(URL1)
soup2 = BeautifulSoup(racing.content, "lxml")
race_index = soup2.findAll('div', {"class": "tableHeaderCell_fh883o"})
for race in race_index:
    print(race)
However, there is clearly some text within the div tags, but the output I am getting is:
<div class="tableHeaderCell_fh883o"></div>
<div class="tableHeaderCell_fh883o"></div>
<div class="tableHeaderCell_fh883o"></div>
I am wondering why the text within the div tags is missing, and how I can extract it.
Answer 1
Score: 0
"是的,你无法获取它,因为这些数据是动态加载的,而不是静态的,所以用BeautifulSoup打开它不会加载这些数据。
相反,如果你在浏览器中打开页面并打开开发者工具,切换到网络选项卡,然后刷新页面,你会发现正在发起的请求。
长话短说,只需前往该链接,你将在那里找到所需的数据以JSON格式加载。
请不要忘记将此解决方案标记为答案,如果解决了你的问题。"
英文:
You can't get it because this data is loaded dynamically rather than served statically, so fetching the page and parsing it with BeautifulSoup won't include it.
Instead, if you open the page in your browser, open DevTools, switch to the Network tab, and then refresh the page, you will find this request being made.
So long story short, just head to that link and you will find your desired data loaded there as JSON.
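As a rough illustration of that approach, the sketch below fetches a JSON endpoint with requests and prints the decoded payload. The URL here is only a placeholder, not the real request; substitute the URL you copy from the Network tab, and inspect the returned structure yourself to find where the R1, R2, etc. labels live.

import requests

# Placeholder only -- replace with the request URL copied from DevTools.
API_URL = "https://example.invalid/racing-schedule.json"

response = requests.get(API_URL)
response.raise_for_status()  # fail loudly if the request is rejected
data = response.json()       # the endpoint responds with JSON, not rendered HTML

# Print the decoded structure first, then drill down to the fields that
# hold the race numbers; the exact keys depend on the actual response.
print(data)

If the plain call is blocked, you may also need to copy browser-like headers (for example the User-Agent) from that same DevTools request.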
Please don't forget to mark this solution as an answer if it resolves your problem.
Comments