2023-06-19 00:13:47

Why does the URL I use change whenever I run the code, when it works fine if I paste it manually?

Question


I am trying to get quarterly revenue data for three stocks using Python. My code starts with the URL `https://stockanalysis.com/stocks/tsla/financials/?p=quarterly`, but the request ends up at `https://stockanalysis.com/stocks/tsla/financials/`, with the `?p=quarterly` part dropped. When I paste the URL into a browser manually, it works fine. I have tried everything I know and cannot tell what is going wrong: no matter what I do, I always get the annual data instead of the quarterly data. Does anyone have any suggestions? Thank you!

import requests
from bs4 import BeautifulSoup
import pandas as pd
tickers = ["AMZN", "FB", "TSLA"]
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'
}
for ticker in tickers:
    # construct the URL directly as a string
    url = f"https://stockanalysis.com/stocks/{ticker}/financials/?p=quarterly"
    print(url)
    response = requests.get(url, headers=headers)
    soup = BeautifulSoup(response.content, 'html.parser')
    table = soup.find('table')
    # Convert the table to a DataFrame
    df = pd.read_html(str(table))[0]
    # Get the first three date columns
    columns = df.columns[1:4]
    # Find the row that contains 'Revenue'
    row = df[df.iloc[:,0].str.contains('Revenue')]
    # Get the values for the last 3 quarters
    values = row[columns].values[0]
    print(f"For {ticker}:")
    print(f"Revenue for last quarter ({columns[0]}) is: {values[0]}")
    print(f"Revenue for 2 quarters ago ({columns[1]}) is: {values[1]}")
    print(f"Revenue for 3 quarters ago ({columns[2]}) is: {values[2]}\n")

I tried setting a User-Agent header to get around any bot blockers, and I even tried saving all the URLs to a text file and loading them for the specific stocks I want. Nothing works: I always get the annual data instead of the quarterly data.

Answer 1

Score: 1

This code should work and return the quarterly values as required. Two changes are needed:

1. Rename FB to META

2. Lowercase each ticker (see the comments in the code)

import requests
from bs4 import BeautifulSoup
import pandas as pd
tickers = ["AMZN", "META", "TSLA"]  # Change 1: FB renamed to META
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'
}
for ticker in tickers:
    ticker = ticker.lower()  # Change 2: lowercase the ticker
    # Construct the URL directly as a string
    url = f"https://stockanalysis.com/stocks/{ticker}/financials/?p=quarterly"
    print(url)
    response = requests.get(url, headers=headers)
    soup = BeautifulSoup(response.content, 'html.parser')
    table = soup.find('table')
    # Convert the table to a DataFrame
    df = pd.read_html(str(table))[0]
    # Get the first three date columns
    columns = df.columns[1:4]
    # Find the row that contains 'Revenue'
    row = df[df.iloc[:, 0].str.contains('Revenue')]
    # Get the values for the last 3 quarters
    values = row[columns].values[0]
    print(f"For {ticker}:")
    print(f"Revenue for last quarter ({columns[0]}) is: {values[0]}")
    print(f"Revenue for 2 quarters ago ({columns[1]}) is: {values[1]}")
    print(f"Revenue for 3 quarters ago ({columns[2]}) is: {values[2]}\n")
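The two changes can also be folded into one small helper so the ticker list itself stays untouched. This is only a sketch: the `RENAMED_TICKERS` mapping and the `financials_url` name are hypothetical and not part of the answer above, and it assumes the site keeps the URL layout shown in the question.

```python
from urllib.parse import urlencode

# Hypothetical mapping from retired symbols to current ones (illustrative only).
RENAMED_TICKERS = {"FB": "META"}

# URL layout taken from the question; assumed to remain valid.
BASE_URL = "https://stockanalysis.com/stocks/{ticker}/financials/"


def financials_url(ticker: str, period: str = "quarterly") -> str:
    """Build the financials URL with a renamed, lowercased ticker."""
    symbol = RENAMED_TICKERS.get(ticker.upper(), ticker.upper())
    # Lowercase the ticker in the path and encode the period as ?p=...
    return BASE_URL.format(ticker=symbol.lower()) + "?" + urlencode({"p": period})


for t in ["AMZN", "FB", "TSLA"]:
    print(financials_url(t))
```

The returned string can be passed to `requests.get` in place of the f-string used in the loop.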

Answer 2

Score: 0

url = f"https://stockanalysis.com/stocks/{ticker.lower()}/financials/?p=quarterly"

If you look closely, manually navigating to https://stockanalysis.com/stocks/AMZN/financials/?p=quarterly redirects you to https://stockanalysis.com/stocks/amzn/financials (a developer workaround, or who knows what else). The redirect drops the query string, so use a lowercase ticker in the request URL to avoid losing the `p` parameter.

Also, you don't need to specify the headers manually, whatever you were using them for.
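One way to confirm that a redirect is what strips the parameter is to compare the URL you requested with the URL you ended up at. The sketch below uses only the standard library; the helper name `lost_query_params` is hypothetical, and the two example URLs are the ones quoted in this answer rather than the result of a live request.

```python
from urllib.parse import parse_qs, urlsplit


def lost_query_params(requested_url: str, final_url: str) -> dict:
    """Return query parameters that are in requested_url but missing or changed in final_url."""
    wanted = parse_qs(urlsplit(requested_url).query)
    got = parse_qs(urlsplit(final_url).query)
    return {key: value for key, value in wanted.items() if got.get(key) != value}


# URLs as described in the answer (not fetched here).
requested = "https://stockanalysis.com/stocks/AMZN/financials/?p=quarterly"
final = "https://stockanalysis.com/stocks/amzn/financials"

print(lost_query_params(requested, final))  # {'p': ['quarterly']}
```

With `requests`, `response.url` holds the final URL after redirects and `response.history` lists the intermediate redirect responses, so `lost_query_params(url, response.url)` returning a non-empty dict suggests the query string was dropped along the way.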
