从一个具有累积回报列的数据框中计算月度回报

huangapple go评论65阅读模式
英文:

Calculating monthly return from a dataframe that has cumulative return down the column

问题

以下是要翻译的内容:

展示 A:

场景 时间步长 累积回报
1 0 1
1 1 1.05
1 2 1.07
2 0 1
2 1 1.04
2 2 1.02

展示 B:

场景 时间步长 累积回报
1 0 .05
1 1 .019
1 2 na
2 0 .04
2 1 -.019
2 2 na

其中第 n 行是每次时间变化时的 (n-1 / n)-1,直到每个场景的末尾。

最终目标是对累积指数进行迭代操作,然后从各个回报中创建线性组合,然后重新滚动到新的累积数。我正在使用 pandas 完成这个任务。

到目前为止,我有以下代码:

for idx,wght, i in zip( indeces,fundweights , range(len(indeces)-1)):
    df_temp = pd.read_csv(filepath + file_names[idx],header=None)
    df_temp = df_temp.stack()
    df_temp.index.names=['Scen','TIME']
    df_temp= df_temp.to_frame(name='CumGrowth_idx')
    df_temp.reset_index(inplace=True)
    df_temp['Scen'] = df_temp['Scen'] + 1 
    df_temp[idx] = idx
    df_temp = df_temp.rename(columns={'TIME':'Month'})
    df_temp["Monthly_Return"]= 

我想要创建月度回报数据框,然后将这些数据框全部附加到一个数据框中,进行线性组合。有人能提供一些建议吗?谢谢!

英文:

I have the following dataframe. How do I create a new column that has the monthly return?

Exhibit A:

Scenario TimeStep CumReturn
1 0 1
1 1 1.05
1 2 1.07
2 0 1
2 1 1.04
2 2 1.02

Exhibit B:

Scenario TimeStep CumReturn
1 0 .05
1 1 .019
1 2 na
2 0 .04
2 1 -.019
2 2 na

Where row n is (n-1 / n)-1 for each change in time, stopping at the end of each scenario.

The end goal is to do this iteratively for cumulative indeces, then create a linearly combination from the individual returns, then RE roll to a new cumulative number.

I am doing this in pandas.

So far I have this code:

  for idx,wght, i in zip( indeces,fundweights , range(len(indeces)-1)):
        df_temp = pd.read_csv(filepath + file_names[idx],header=None)
        df_temp = df_temp.stack()
        df_temp.index.names=['Scen','TIME']
        df_temp= df_temp.to_frame(name='CumGrowth_idx')
        df_temp.reset_index(inplace=True)
        df_temp['Scen'] = df_temp['Scen'] + 1 
        df_temp[idx] = idx
        df_temp = df_temp.rename(columns={'TIME':'Month'})
        df_temp["Monthly_Return"]= 

I want to create the monthly return dataframe, then append these all to a dataframe, take a linear combination. Can anyone offer some input?

Thanks

答案1

得分: 1

以下是翻译好的部分:

虽然使用Pandas的一些合并和分组方法可能有更加优雅的方式来完成这个任务,但这里提供一种可行的方法。

首先,定义一个函数来根据“Scenario Indexs”(场景索引)拆分数据框,然后使用shift函数更新“cumReturn”列,最后将单独的数据框连接回一个数据框,如下所示:

import pandas as pd

def computeDifference(df: pd.DataFrame) -> pd.DataFrame:
    scnslist = df['Scenario'].unique()
    rslt = df[df['Scenario'] == scnslist[0]].copy()
    rslt['CumReturn'] = (rslt['CumReturn'].shift(-1) / rslt['CumReturn']) - 1
    for i in range(1, len(scnslist)):
        dx = df[df['Scenario'] == scnslist[i]].copy()
        dx['CumReturn'] = (dx['CumReturn'].shift(-1) / dx['CumReturn']) - 1
        rslt = pd.concat([rslt, dx], axis=0, ignore_index=True)
    return rslt

当给定数据框 Exhibit 1,并将其定义为df时,您可以执行 computeDifference(df),得到以下结果:

    Scenario    TimeStep    CumReturn
0   1           0           0.050000
1   1           1           0.019048
2   1           2           NaN
3   2           0           0.040000
4   2           1          -0.019231
5   2           2           NaN
英文:

While there is probably a much more elegant way to accomplish this with some of Pandas' merge and group methods, here is a method that will work.

First of all define a function to split the dataframe by Scenario Indexs, then update the cumReturn column using the shift function, finally concat the separate frames back into 1 frame as follows:

import pandas as pd

def computeDifference(df: pd.DataFrame)-> pd.DataFrame:
    scnslist = df['Scenario'].unique()
    rslt = df[df['Scenario'] == scnslist[0]].copy()
    rslt['CumReturn'] = (rslt['CumReturn'].shift(-1)/rslt['CumReturn'])-1
    for i in range(1, len(scnslist)):
        dx = df[df['Scenario'] == scnslist[i]].copy()
        dx['CumReturn'] = (dx['CumReturn'].shift(-1)/dx['CumReturn'])-1
        rslt = pd.concat([rslt, dx], axis=0, ignore_index=True)
    return rslt

When given the dataframe in Exhibit 1, defined as df
you can execute computeDifference(df) which yields:

	Scenario	TimeStep	CumReturn
0	1	0	0.050000
1	1	1	0.019048
2	1	2	NaN
3	2	0	0.040000
4	2	1	-0.019231
5	2	2	NaN

答案2

得分: 0

df_temp = pd.read_csv(filepath + file_names_equity[idx], header=None)
df_temp = df_temp.stack()
df_temp.index.names = ['Scen', 'TIME']
df_temp = df_temp.to_frame(name='CumGrowth_idx')
df_temp.reset_index(inplace=True)
df_temp['Scen'] = df_temp['Scen'] + 1
df_temp[idx] = idx
df_temp = df_temp.rename(columns={'TIME': 'Month'})

df_temp["Monthly_Return"] = (df_temp["CumGrowth_idx"].div(df_temp["CumGrowth_idx"].shift(1)) - 1).shift(-1)
df_temp["Monthly_Return"] = np.where(df_temp['Month'] == 600, 0, df_temp['Monthly_Return'])

英文:
 df_temp = pd.read_csv(filepath + file_names_equity[idx],header=None)
        df_temp = df_temp.stack()
        df_temp.index.names=['Scen','TIME']
        df_temp= df_temp.to_frame(name='CumGrowth_idx')
        df_temp.reset_index(inplace=True)
        df_temp['Scen'] = df_temp['Scen'] + 1 
        df_temp[idx] = idx
        df_temp = df_temp.rename(columns={'TIME':'Month'})
        
        df_temp["Monthly_Return"]= (df_temp["CumGrowth_idx"].div(df_temp["CumGrowth_idx"].shift(1)) - 1).shift(-1)
        df_temp["Monthly_Return"] = np.where(df_temp['Month'] == 600,0, df_temp['Monthly_Return'] )

huangapple
  • 本文由 发表于 2023年6月2日 04:37:38
  • 转载请务必保留本文链接:https://go.coder-hub.com/76385545.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定