Can't fetch data from the analysis tab on Yahoo Finance

Question

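I'm trying to scrape some tabular content from a website. The way this [website](https://finance.yahoo.com/quote/AAPL/analysis?p=AAPL) loads its data has changed dramatically. Previously, the data I needed could be found in script tags in the page source. I looked at the endpoints in dev tools but could not find the data there, although I may have missed something. I'm interested in the table under `Revenue Estimate`. This is how I used to fetch the content: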

```python
import re
import json
import requests
from pprint import pprint

link = 'https://finance.yahoo.com/quote/AAPL/analysis?p=AAPL'

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36',
}

with requests.Session() as s:
    s.headers.update(headers)
    res = s.get(link)

    # The page used to embed its data as JSON assigned to root.App.main
    # inside a script tag; guard against the pattern no longer matching.
    matches = re.findall(r'root\.App\.main[^{]+([\s\S].*);', res.text)
    jsoncontent = json.loads(matches[0]) if matches else {}

    try:
        container = jsoncontent['context']['dispatcher']['stores']['QuoteSummaryStore']['earningsTrend']
    except (KeyError, TypeError):
        container = ""

    pprint(container)
```




# Answer 1
**Score**: 1

Try using:

```python
import requests

headers = {
    'authority': 'finance.yahoo.com',
    'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
    'accept-language': 'de,de-DE;q=0.9,en;q=0.8,en-GB;q=0.7,en-US;q=0.6,fr;q=0.5,de-CH;q=0.4,es;q=0.3',
    'cache-control': 'no-cache',
    'dnt': '1',
    'pragma': 'no-cache',
    'sec-ch-ua': '"Not_A Brand";v="99", "Microsoft Edge";v="109", "Chromium";v="109"',
    'sec-ch-ua-mobile': '?0',
    'sec-ch-ua-platform': '"Windows"',
    'sec-fetch-dest': 'document',
    'sec-fetch-mode': 'navigate',
    'sec-fetch-site': 'same-origin',
    'sec-fetch-user': '?1',
    'upgrade-insecure-requests': '1',
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36 Edg/109.0.1518.78',
}

params = {
    'p': 'AAPL',
}

# Request the analysis page with browser-like headers.
response = requests.get('https://finance.yahoo.com/quote/AAPL/analysis', params=params, headers=headers)
```

After that, parse the desired values from `response.content`.
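The parsing step is left open in this answer. As a rough sketch only, assuming the table arrives as ordinary `<table>` markup in the returned HTML, the standard library's `html.parser` can collect tables and pick one by its top-left header cell. `TableCollector`, `find_table`, and `sample_html` are hypothetical names introduced here for illustration, the sample markup is a stand-in rather than real page content, and nested tables are not handled.

```python
from html.parser import HTMLParser


class TableCollector(HTMLParser):
    """Collect every <table> in a document as a list of rows of cell text."""

    def __init__(self):
        super().__init__()
        self.tables = []    # finished tables
        self._rows = None   # rows of the table being read, or None
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self._rows = []
        elif tag == "tr" and self._rows is not None:
            self._row = []
        elif tag in ("th", "td") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("th", "td") and self._cell is not None:
            # Join the fragments and normalise internal whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self._rows.append(self._row)
            self._row = None
            self._cell = None
        elif tag == "table" and self._rows is not None:
            self.tables.append(self._rows)
            self._rows = None
            self._row = None
            self._cell = None


def find_table(html, first_header):
    """Return the first table whose top-left cell equals first_header, else None."""
    parser = TableCollector()
    parser.feed(html)
    for table in parser.tables:
        if table and table[0] and table[0][0] == first_header:
            return table
    return None


# Stand-in markup for illustration; real code would pass response.text instead.
sample_html = """
<table><tr><th>Other Table</th></tr><tr><td>placeholder</td></tr></table>
<table><tr><th>Revenue Estimate</th><th>Current Qtr. (Mar 2023)</th></tr>
<tr><td>No. of Analysts</td><td>24</td></tr>
<tr><td>Avg. Estimate</td><td>93.19B</td></tr></table>
"""

revenue = find_table(sample_html, "Revenue Estimate")
for row in revenue:
    print(row)
```

Matching on the header text rather than on the table's position keeps the lookup working if the page adds or reorders tables, at the cost of breaking if the header wording changes.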


# Answer 2
**Score**: 1

You can use a pandas DataFrame to get the Revenue Estimate table data as follows:

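```python
from io import StringIO

import requests
import pandas as pd

headers = {"user-agent": "Mozilla/5.0"}

res = requests.get("https://finance.yahoo.com/quote/AAPL/analysis?p=AAPL&guccounter=1", headers=headers).text

# read_html returns one DataFrame per <table>; Revenue Estimate was the second
# table on the page when this answer was written. Newer pandas versions expect
# a file-like object rather than a raw HTML string, hence StringIO.
df = pd.read_html(StringIO(res))[1]
print(df)
```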

Output:

```
          Revenue Estimate Current Qtr. (Mar 2023) Next Qtr. (Jun 2023) Current Year (2023) Next Year (2024)
0          No. of Analysts                      24                   23                  39               36
1            Avg. Estimate                  93.19B               85.59B             392.39B          417.75B
2             Low Estimate                  91.81B               81.32B             378.62B          398.67B
3            High Estimate                  98.84B               90.12B             414.04B          438.76B
4           Year Ago Sales                  97.28B               82.96B             394.33B          392.39B
5  Sales Growth (year/est)                  -4.20%                3.20%              -0.50%            6.50%
```
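The cells in the output above are strings such as `93.19B` and `-4.20%`, so they usually need converting before any arithmetic. The following is a minimal sketch of one way to do that. `to_number` and `SUFFIXES` are hypothetical names, and the suffix set is an assumption, since only `B` and `%` appear in the output shown.

```python
# Multipliers for magnitude suffixes. Only "B" appears in the output above;
# the other entries are assumptions, so extend or trim as needed.
SUFFIXES = {"K": 1e3, "M": 1e6, "B": 1e9, "T": 1e12}


def to_number(text):
    """Convert strings such as '93.19B', '-4.20%' or '24' to a float.

    Percentages are returned as fractions, so '-4.20%' becomes about -0.042.
    Returns None for empty or non-numeric cells.
    """
    value = text.strip().replace(",", "")
    if not value:
        return None
    scale = 1.0
    if value.endswith("%"):
        value, scale = value[:-1], 0.01
    elif value[-1].upper() in SUFFIXES:
        value, scale = value[:-1], SUFFIXES[value[-1].upper()]
    try:
        return float(value) * scale
    except ValueError:
        return None


# Usage with the "Avg. Estimate" row from the output above.
row = ["Avg. Estimate", "93.19B", "85.59B", "392.39B", "417.75B"]
values = [to_number(cell) for cell in row[1:]]
print(values)
```

With a DataFrame like `df`, the same function could be applied column by column, for example with `df[column].map(to_number)`, leaving the label column untouched.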

Posted by huangapple on 2023-02-08 19:39:49
Source: https://go.coder-hub.com/75385266.html