How do I scrape the data from this site?
Question
I would like to get the eight years of data behind the graph 'Encours des parts référencées' on this site.
I don't know where to find the data. I inspected the page but could not see it. I would like to know where the data is, how to get it, and which API I should use.
Here's an image of the site.
I tried:
import requests
import pandas as pd
from bs4 import BeautifulSoup
import json
But after that I don't know what to do.
Any help would be appreciated.
Answer 1
Score: 1
The data is embedded in the page as JSON, in the value attribute of an input element inside #chartEncours8a. To load it into a pandas DataFrame, you can do:
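import json

import requests
import pandas as pd
from bs4 import BeautifulSoup

url = "https://www.quantalys.com/espace/518"
soup = BeautifulSoup(requests.get(url).content, "html.parser")

# The chart configuration is JSON stored in the value attribute of the
# <input> inside the element with id "chartEncours8a"
data = soup.select_one("#chartEncours8a input")["value"]
data = json.loads(data)

# One row per year; the unit comes from the first value axis
df = pd.DataFrame(data['dataProvider'])
df['unit'] = data['valueAxes'][0]['unit']
print(df)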
Prints:
   category  column-1  unit
0      2015     41.02  Mrd€
1      2016     44.92  Mrd€
2      2017     43.44  Mrd€
3      2018     31.58  Mrd€
4      2019     25.30  Mrd€
5      2020     25.26  Mrd€
6      2021     25.55  Mrd€
7      2022     19.57  Mrd€
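If you want to check the parsing step without hitting the site, here is a minimal offline sketch of the same idea: read the JSON out of an input's value attribute and flatten it into rows. It uses the standard library's html.parser instead of BeautifulSoup so it runs with no extra packages. The SAMPLE_HTML markup, the ChartInputFinder class and the extract_chart_rows helper are my own assumptions for illustration; only the id, the JSON keys and the two sample values come from the code and output above, and the real page may be structured differently.

```python
import json
from html.parser import HTMLParser

# Hypothetical markup (an assumption): an element with id="chartEncours8a"
# holding an <input> whose value attribute is the chart JSON.
SAMPLE_HTML = """
<div id="chartEncours8a">
  <input type="hidden" value='{"dataProvider": [{"category": "2015", "column-1": 41.02}, {"category": "2016", "column-1": 44.92}], "valueAxes": [{"unit": "Mrd€"}]}'>
</div>
"""


class ChartInputFinder(HTMLParser):
    """Capture the value of the first <input> seen after the target id's opening tag."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.seen_target = False
        self.value = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if attributes.get("id") == self.target_id:
            self.seen_target = True
        elif self.seen_target and tag == "input" and self.value is None:
            self.value = attributes.get("value")


def extract_chart_rows(html, target_id="chartEncours8a"):
    """Return one dict per data point, with the axis unit attached to each row."""
    finder = ChartInputFinder(target_id)
    finder.feed(html)
    if finder.value is None:
        raise ValueError(f"no <input> value found after element #{target_id}")
    chart = json.loads(finder.value)
    unit = chart["valueAxes"][0]["unit"]
    return [
        {"year": int(row["category"]), "value": row["column-1"], "unit": unit}
        for row in chart["dataProvider"]
    ]


rows = extract_chart_rows(SAMPLE_HTML)
for row in rows:
    print(row)
```

The ValueError is there because a missing element is the most likely failure if the site changes its markup. With BeautifulSoup the equivalent check is that select_one did not return None before you index into it.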