How can I add columns from different dataframes in a loop?

Question

I am currently working with financial data for a fun project and want to add columns to an existing dataframe in the following way: I loop through a list of indicator names and make an API call for each indicator, receiving the data as CSV. I then want to take every column except the time column (since it is the same for each indicator) and append them to a dataframe with price data. So far I have narrowed the problem down to the following code, since everything before it works as it should.

inditest = ['SMA', 'EMA', 'WMA']
df = pd.DataFrame()
for i in inditest: 
    indicator_url = f"https://www.alphavantage.co/query?function={i}&symbol={SYMBOL}&interval=15min&time_period=60&series_type=close&datatype=csv&apikey={API_KEY}"
    
    # get indicator data from api 
    data = requests.get(indicator_url)
    with open(location + f"{i}_{SYMBOL}_base.csv", 'wb') as fi:
        fi.write(data.content)

    # make columns for indicator data
    idf = pd.read_csv(location + f"{i}_{SYMBOL}_base.csv")
    idata = idf.iloc[:, 1:]

    # add indicator columns to existing dataframe
    df = pd.concat([df, idata], axis=1, join="inner")

I receive the data from the API just fine, so the problem should be in adding the columns.

To simplify, I made an empty DataFrame here. In the original code, df is defined outside of the loop and has 2 columns. The SYMBOL variable in my case is AAPL, for the Apple stock.

Thanks in advance!

Answer 1

Score: 1

Two things:

  • Move time to the index so pd.concat can line up the series correctly. Do not assume that every series contains the same rows, or that the rows arrive in the same order.
  • pd.read_csv can read directly from a URL, so there is no need to save the file locally first.

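
To illustrate the first point without calling the API, here is a minimal offline sketch. The two small frames and their values are made up for illustration; they are not real Alpha Vantage output. The sketch compares concatenating by row position with concatenating by a time index, and also shows what the pattern in the question (an empty dataframe combined with join="inner") returns.

```python
import pandas as pd

# Made-up stand-ins for two indicator CSVs (illustrative values only).
sma = pd.DataFrame({
    "time": ["2023-05-12 16:00", "2023-05-12 15:45", "2023-05-12 15:30"],
    "SMA": [172.1, 172.0, 171.9],
})
# One timestamp is missing here and the rows are in a different order.
ema = pd.DataFrame({
    "time": ["2023-05-12 15:30", "2023-05-12 16:00"],
    "EMA": [171.8, 172.2],
})

# Positional concat: rows are paired by the default 0, 1, 2 index,
# so the 16:00 SMA value ends up next to the 15:30 EMA value.
by_position = pd.concat([sma[["SMA"]], ema[["EMA"]]], axis=1)

# Label-based concat: rows are paired by timestamp; gaps become NaN.
by_time = pd.concat([sma.set_index("time"), ema.set_index("time")], axis=1)

# The question's pattern: an inner join keeps only index labels present in
# every frame, and an empty frame has no labels, so no rows survive.
empty_start = pd.concat([pd.DataFrame(), sma[["SMA"]]], axis=1, join="inner")

print(by_position)
print(by_time)
print(empty_start.shape)
```

With the empty starting frame, the inner join is the likely reason the loop in the question produces nothing. In the original two-column case the outcome depends on whether the price frame and the indicator frames share index labels.
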
from urllib.parse import urlencode

import pandas as pd

base_url = "https://www.alphavantage.co/query?"

# All your URL requests share these parameters
base_params = {
    "symbol": "MSFT",  # whatever symbol you are after
    "interval": "15min",
    "time_period": 60,
    "series_type": "close",
    "datatype": "csv",
    "apikey": "...",
}

inditest = ["SMA", "EMA", "WMA"]
indicators = [
    # `urlencode` will build the "symbol=MSFT&interval=15min&..." string.
    # We only need to prefix it with the `base_url` and add `function`.
    # We also move `time` to the index so `pd.concat` can later align the dataframes.
    pd.read_csv(base_url + urlencode({**base_params, "function": i}), index_col="time")
    for i in inditest
]

# Outer join by default: rows are matched on the `time` index
result = pd.concat(indicators, axis=1)
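
The question also mentions an existing dataframe with two price columns. One possible way to attach the indicator columns to it is a join on the time index. This is only a sketch: the `prices` frame, its column names, and all values are assumptions made for illustration, and the small `result` frame stands in for the one built above.

```python
import pandas as pd

# Hypothetical price frame with two columns, indexed by timestamp (made-up values).
prices = pd.DataFrame(
    {"open": [171.5, 171.9, 172.0], "close": [171.9, 172.0, 172.3]},
    index=pd.Index(
        ["2023-05-12 15:30", "2023-05-12 15:45", "2023-05-12 16:00"], name="time"
    ),
)

# Stand-in for `result`: indicator columns indexed by `time`,
# with one timestamp missing and the rows in a different order.
result = pd.DataFrame(
    {"SMA": [172.1, 171.9], "EMA": [172.2, 171.8]},
    index=pd.Index(["2023-05-12 16:00", "2023-05-12 15:30"], name="time"),
)

# A left join keeps every price row; timestamps without indicator data become NaN.
combined = prices.join(result, how="left")
print(combined)
```

The join only matches rows whose index values are equal, so both frames need the same index type. If one side holds timestamps as strings and the other as datetimes, nothing lines up; passing parse_dates=True to pd.read_csv (which asks it to parse the index) on both sides is one way to keep them consistent.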

huangapple
  • Posted on May 13, 2023 at 21:22:15
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76242950.html