How to compute expectancy in a dataframe across rows

Question


I have a dataframe that contains day, symbol, strategy, and pnl. I want to analyze and compare pnl in a couple of ways.

I'd like to get the win-rate & expectancy when grouped by symbol and strategy. So I've done this:

    def stats(s):
        winrate = s['isWinner']['count'] / (s['isWinner']['count'] + s['isLoser']['count'])
        expectancy = s['isWinner']['mean'] * winrate - s['isLoser']['mean'] * (1.0 - winrate)

    df["isWinner"] = df['pnl'] >= 0
    df["isLoser"] = df['pnl'] < 0
    df2 = df.groupby(['day', 'symbol', 'strategy', 'isWinner']).agg({'pnl': ['count', 'mean', 'std', 'min', 'max']})
    df2.groupby(['day', 'symbol', 'strategy']).agg(stats)

Apparently, I can't do s['isWinner'] in the stats function. What am I doing wrong?

Once the stats function works, how do I add winrate and expectancy to df2?

Am I going about this the right way? Is it necessary to create df2 from df, or is there a better way?

Answer 1

Score: 0


I'm sure there is a more pythonic way to do this, but it works.

    import pandas as pd

    # Flag each trade as a winner (pnl >= 0) or a loser (pnl < 0).
    df["isWinner"] = df['pnl'] >= 0
    df["isLoser"] = df['pnl'] < 0

    rows = []
    # Group by a plain string so that `name` is a scalar, not a 1-tuple.
    for name, grp in df.groupby('strategy'):
        wincount = grp['isWinner'].sum()
        loscount = grp['isLoser'].sum()
        winrate = wincount / (wincount + loscount)
        profits = grp.loc[grp['isWinner'], 'pnl'].sum()
        losses = grp.loc[grp['isLoser'], 'pnl'].sum()
        avgwin = profits / wincount
        avglos = losses / loscount
        # avglos is negative, so this subtracts the expected loss.
        expectancy = winrate * avgwin + (1.0 - winrate) * avglos
        rows.append({'name': name, 'wincount': wincount, 'losscount': loscount,
                     'winrate': winrate, 'profits': profits, 'losses': losses,
                     'avgwin': avgwin, 'avgloss': avglos, 'expectancy': expectancy})

    # DataFrame.append was removed in pandas 2.0, so build the frame from the list of rows.
    stats = pd.DataFrame(rows)

    stats

Results:

    index  name                                      wincount  losscount  winrate              profits   losses     avgwin              avgloss              expectancy
    1      Follow outside candles OPTIONS Filter 15  93.0      105.0      0.4696969696969697   36898.0   -20096.0   396.752688172043    -191.3904761904762   84.85858585858587
    6      Scalp outside candles OPTIONS             11.0      20.0       0.3548387096774194   3409.0    -2971.0    309.90909090909093  -148.55              14.129032258064527
    0      Follow outside candles OPTIONS            650.0     980.0      0.3987730061349693   200595.0  -178813.0  308.60769230769233  -182.46224489795918  13.36319018404906
    10     Scalp outside v5                          535.0     886.0      0.3764954257565095   125250.0  -108215.0  234.11214953271028  -122.13882618510158  11.988036593947925
    5      Outside v4                                151.0     163.0      0.48089171974522293  6257.0    -5105.0    41.437086092715234  -31.319018404907975  3.6687898089171966
    4      Outside v3                                110.0     172.0      0.3900709219858156   7813.0    -6852.0    71.02727272727273   -39.83720930232558   3.4078014184397176
    2      Outside v1                                113.0     151.0      0.42803030303030304  10498.0   -9790.0    92.90265486725664   -64.83443708609272   2.68181818181818
    13     Scalp outside v8                          607.0     729.0      0.45434131736526945  79443.0   -79559.0   130.87808896210873  -109.13443072702331  -0.08682634730538297
    15     Scalp v2                                  78.0      103.0      0.430939226519337    1702.0    -1748.0    21.82051282051282   -16.97087378640777   -0.25414364640883846
    3      Outside v2                                81.0      117.0      0.4090909090909091   7603.0    -7773.0    93.8641975308642    -66.43589743589743   -0.8585858585858475
    14     Scalp v1                                  47.0      53.0       0.47                 833.0     -1402.0    17.72340425531915   -26.452830188679247  -5.690000000000001
    7      Scalp outside v2                          87.0      124.0      0.41232227488151657  18476.0   -20010.0   212.367816091954    -161.3709677419355   -7.270142180094808
    8      Scalp outside v3                          66.0      90.0       0.4230769230769231   9255.0    -10768.0   140.22727272727272  -119.64444444444445  -9.698717948717949
    12     Scalp outside v7                          27.0      52.0       0.34177215189873417  4015.0    -5784.0    148.7037037037037   -111.23076923076923  -22.392405063291136
    11     Scalp outside v6                          29.0      60.0       0.3258426966292135   6816.0    -8878.0    235.0344827586207   -147.96666666666667  -23.168539325842687
    9      Scalp outside v4                          48.0      104.0      0.3157894736842105   8015.0    -11622.0   166.97916666666666  -111.75              -23.730263157894747
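
The same per-group numbers can probably be obtained without an explicit loop. The sketch below is one possible vectorized variant, not the answer author's method. It assumes `df` has a numeric `pnl` column with no missing values; the helper name `expectancy_table`, the temporary columns `win_pnl` and `loss_pnl`, and the sample `trades` frame are all made up for illustration. It also hints at why `s['isWinner']` could not work in the question: in `df2`, `isWinner` is an index level rather than a column, `isLoser` is not present at all, and `agg` with a plain function generally hands that function one column at a time.

```python
import pandas as pd


def expectancy_table(df, keys):
    """Per-group win rate and expectancy computed from a 'pnl' column."""
    pnl = df["pnl"]
    flagged = df.assign(
        win_pnl=pnl.where(pnl >= 0),   # NaN on losing rows
        loss_pnl=pnl.where(pnl < 0),   # NaN on winning rows
    )
    out = flagged.groupby(keys).agg(
        wincount=("win_pnl", "count"),    # count skips NaN
        losscount=("loss_pnl", "count"),
        profits=("win_pnl", "sum"),
        losses=("loss_pnl", "sum"),
        avgwin=("win_pnl", "mean"),
        avgloss=("loss_pnl", "mean"),
    )
    out["winrate"] = out["wincount"] / (out["wincount"] + out["losscount"])
    # avgloss is negative, so the second term subtracts the expected loss.
    # fillna(0) makes a group with no winners (or no losers) contribute 0
    # for the missing side instead of turning the whole result into NaN.
    out["expectancy"] = (
        out["winrate"] * out["avgwin"].fillna(0)
        + (1.0 - out["winrate"]) * out["avgloss"].fillna(0)
    )
    return out.reset_index()


# Hypothetical sample data, only to show the call.
trades = pd.DataFrame({
    "strategy": ["A", "A", "A", "A", "B", "B"],
    "pnl": [100.0, -50.0, 200.0, -50.0, 10.0, -30.0],
})
result = expectancy_table(trades, ["strategy"])
print(result)  # strategy A: winrate 0.5, expectancy 50.0; B: winrate 0.5, expectancy -10.0
```

Passing `['day', 'symbol', 'strategy']` as `keys` would give the per-day breakdown the question asks about, and `df.merge(result, on=keys)` is one way to attach the figures back to the original rows. With these definitions the expectancy of a group equals the plain mean of its `pnl`, which makes a convenient cross-check.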

huangapple
  • Published on February 10, 2023 at 11:26:39
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75406645.html