How to web scrape data from a graph
Question
Hi, I would like to scrape data from this web page, in particular the chart of historical data (here and here).
Could someone help me with how to proceed? More importantly, how and where can I find the data behind the chart?
Answer 1
Score: 1
The data for the graph is stored inside the HTML document as JSON. To parse it, you can use the following example:
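import json

import pandas as pd
import requests
from bs4 import BeautifulSoup

url = "https://www.quantalys.com/Fonds/Historique/19801"
soup = BeautifulSoup(requests.get(url).content, "html.parser")

# The chart configuration is embedded as JSON in the "value" attribute
# of the element that carries the data-chartconfig attribute.
data = json.loads(soup.select_one("[data-chartconfig]")["value"])

# "dataProvider" holds one record per date; "graphs" describes each series.
df = pd.DataFrame(data["dataProvider"])

# Name the columns after the series: keep the text after the first ":" in balloonText.
df.columns = ["Date"] + [
    t["balloonText"].split(":", maxsplit=1)[-1].strip() for t in data["graphs"]
]
print(df.head())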
Prints:
         Date  Amundi Euro High Yield Bond A EUR AD  Oblig. Europe Ht Rendt  ICE BofA European Currency High Yield Index
0  2020-06-19                                 100.00                  100.00                                       100.00
1  2020-06-20                                 100.00                  100.00                                       100.00
2  2020-06-21                                 100.00                  100.00                                       100.00
3  2020-06-22                                  99.07                   99.78                                        99.80
4  2020-06-23                                  99.07                   99.85                                        99.85
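
On a different page it may not be obvious which element holds the chart data. One possible way to look for it is to scan every attribute in the HTML for values that parse as a JSON object. The following is only a minimal sketch using the standard library, run against a made-up snippet: the markup, the `x` / `y_0` keys, the fund name and the `JsonAttributeFinder` class are assumptions for illustration and are not taken from the real page.

```python
import json
from html.parser import HTMLParser


class JsonAttributeFinder(HTMLParser):
    """Collect (tag, attribute, parsed JSON) for attributes holding a JSON object."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            # Skip valueless attributes and values that cannot be a JSON object.
            if not value or not value.lstrip().startswith("{"):
                continue
            try:
                parsed = json.loads(value)
            except json.JSONDecodeError:
                continue  # Looked like JSON but was not; ignore it.
            if isinstance(parsed, dict):
                self.found.append((tag, name, parsed))


# Hypothetical stand-in for a downloaded page; the real markup may differ.
sample_html = """
<html><body>
  <input type="hidden" data-chartconfig
         value='{"dataProvider": [{"x": "2020-06-19", "y_0": 100.0}],
                 "graphs": [{"balloonText": "Fund: Example Fund"}]}'>
  <div class="chart"></div>
</body></html>
"""

finder = JsonAttributeFinder()
finder.feed(sample_html)

# Report where JSON was found and which top-level keys it contains.
for tag, attr, payload in finder.found:
    print(tag, attr, sorted(payload))
```

Once the tag and attribute are known, a selector like the one in the answer can target that element directly. If nothing is found this way, the data may instead sit inside a `<script>` block or be loaded by a separate request, which this sketch does not cover.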