How to web scrape data from a graph

Question

Hi, I would like to scrape data from this web page, especially the graphs of historical data (here and here).

Could someone explain how to proceed and, more importantly, where and how to find the data behind the graph?

Answer 1

Score: 1

The data for the graph is stored inside the HTML document as JSON. To parse it, you can use the following example:

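    import json

    import pandas as pd
    import requests
    from bs4 import BeautifulSoup

    url = "https://www.quantalys.com/Fonds/Historique/19801"
    soup = BeautifulSoup(requests.get(url).content, "html.parser")

    # The chart configuration is a JSON string stored in the "value"
    # attribute of the element that carries a data-chartconfig attribute.
    data = soup.select_one("[data-chartconfig]")["value"]
    data = json.loads(data)

    # "dataProvider" holds one record per date.
    df = pd.DataFrame(data["dataProvider"])

    # Name the value columns after each series: the name is the text that
    # follows the first colon in the graph's "balloonText".
    df.columns = ["Date"] + [
        t["balloonText"].split(":", maxsplit=1)[-1].strip() for t in data["graphs"]
    ]
    print(df.head())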

Prints:

             Date  Amundi Euro High Yield Bond A EUR AD  Oblig. Europe Ht Rendt  ICE BofA European Currency High Yield Index
    0  2020-06-19                                100.00                  100.00                                       100.00
    1  2020-06-20                                100.00                  100.00                                       100.00
    2  2020-06-21                                100.00                  100.00                                       100.00
    3  2020-06-22                                 99.07                   99.78                                        99.80
    4  2020-06-23                                 99.07                   99.85                                        99.85
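
As for where and how to find the data in the first place: one option is to scan every HTML attribute on the page for a value that parses as JSON. Here is a minimal sketch of that idea using only the standard library. The sample markup, the `x`/`y_0`/`y_1` keys and the series names are made up for illustration; only `dataProvider`, `graphs` and `balloonText` come from the answer above, and the real page may well be laid out differently.

```python
import html
import json
from html.parser import HTMLParser


class JsonAttributeFinder(HTMLParser):
    """Collect every HTML attribute whose value parses as a JSON object or array."""

    def __init__(self):
        super().__init__()
        self.found = []  # tuples of (tag, attribute name, decoded JSON)

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            # Cheap pre-filter: only try values that look like JSON containers.
            if not value or value.lstrip()[:1] not in ("{", "["):
                continue
            try:
                self.found.append((tag, name, json.loads(value)))
            except json.JSONDecodeError:
                pass  # looked like JSON but was not; ignore it


# Hypothetical chart configuration (assumed shape, not copied from the site).
config = {
    "dataProvider": [
        {"x": "2020-06-19", "y_0": 100.0, "y_1": 100.0},
        {"x": "2020-06-22", "y_0": 99.07, "y_1": 99.78},
    ],
    "graphs": [
        {"balloonText": "[[x]]: Fund A"},
        {"balloonText": "[[x]]: Benchmark B"},
    ],
}

# Hypothetical page: the JSON is HTML-escaped and stored in a "value" attribute.
sample_html = (
    "<html><body>"
    f'<input type="hidden" data-chartconfig value="{html.escape(json.dumps(config))}">'
    "</body></html>"
)

finder = JsonAttributeFinder()
finder.feed(sample_html)

# Take the first JSON-bearing attribute and rebuild the rows by series name.
tag, attribute, chart = finder.found[0]
names = ["Date"] + [
    g["balloonText"].split(":", maxsplit=1)[-1].strip() for g in chart["graphs"]
]
rows = [dict(zip(names, record.values())) for record in chart["dataProvider"]]

print(tag, attribute)
for row in rows:
    print(row)
```

On a real page you would feed it `requests.get(url).text` instead of `sample_html`. If nothing turns up, the chart data is probably fetched by a separate request, which should be visible in the Network tab of the browser's developer tools.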

Posted by huangapple on June 22, 2023 at 04:25:14
Source: https://go.coder-hub.com/76526893.html