February 10, 2023, 11:26:39

How to compute expectancy in a dataframe across rows

Question

I have a dataframe that contains day, symbol, strategy, and pnl. I want to analyze and compare pnl in a couple of ways.

I'd like to get the win-rate & expectancy when grouped by symbol and strategy. So I've done this:

    def stats(s):
        winrate = s['isWinner']['count'] / (s['isWinner']['count'] + s['isLoser']['count'])
        expectancy = s['isWinner']['mean'] * winrate - s['isLoser']['mean'] * (1.0 - winrate)

    df["isWinner"] = df['pnl'] >= 0
    df["isLoser"] = df['pnl'] < 0
    df2 = df.groupby(['day', 'symbol', 'strategy', 'isWinner']).agg({'pnl': ['count', 'mean', 'std', 'min', 'max']})
    df2.groupby(['day', 'symbol', 'strategy']).agg(stats)

Apparently, I can't do s['isWinner'] in the stats function. What am I doing wrong?

Once the stats function works, how do I add winrate and expectancy to df2?

Am I going about this the right way? Is it necessary to create df2 from df, or is there a better way?

Answer 1

Score: 0

I'm sure there is a more pythonic way to do this, but it works.

    import pandas as pd

    # Flag each trade as a winner (pnl >= 0) or a loser (pnl < 0)
    df["isWinner"] = df['pnl'] >= 0
    df["isLoser"] = df['pnl'] < 0

    rows = []
    # Group by a single column name so that `name` is a plain string
    for name, grp in df.groupby('strategy'):
        wincount = grp['isWinner'].sum()
        loscount = grp['isLoser'].sum()
        winrate = wincount / (wincount + loscount)
        profits = grp.loc[grp['isWinner'], 'pnl'].sum()
        losses = grp.loc[grp['isLoser'], 'pnl'].sum()  # zero or negative
        avgwin = profits / wincount
        avglos = losses / loscount
        # avglos is negative, so the loss term is added, not subtracted
        expectancy = winrate * avgwin + (1.0 - winrate) * avglos
        rows.append({'name': name, 'wincount': wincount, 'losscount': loscount, 'winrate': winrate, 'profits': profits, 'losses': losses, 'avgwin': avgwin, 'avgloss': avglos, 'expectancy': expectancy})

    # DataFrame.append was removed in pandas 2.0, so build the frame from the list of rows
    stats = pd.DataFrame(rows)
    stats

Results:

index	name	wincount	losscount	winrate	profits	losses	avgwin	avgloss	expectancy
1	Follow outside candles OPTIONS Filter 15	93.0	105.0	0.4696969696969697	36898.0	-20096.0	396.752688172043	-191.3904761904762	84.85858585858587
6	Scalp outside candles OPTIONS	11.0	20.0	0.3548387096774194	3409.0	-2971.0	309.90909090909093	-148.55	14.129032258064527
0	Follow outside candles OPTIONS	650.0	980.0	0.3987730061349693	200595.0	-178813.0	308.60769230769233	-182.46224489795918	13.36319018404906
10	Scalp outside v5	535.0	886.0	0.3764954257565095	125250.0	-108215.0	234.11214953271028	-122.13882618510158	11.988036593947925
5	Outside v4	151.0	163.0	0.48089171974522293	6257.0	-5105.0	41.437086092715234	-31.319018404907975	3.6687898089171966
4	Outside v3	110.0	172.0	0.3900709219858156	7813.0	-6852.0	71.02727272727273	-39.83720930232558	3.4078014184397176
2	Outside v1	113.0	151.0	0.42803030303030304	10498.0	-9790.0	92.90265486725664	-64.83443708609272	2.68181818181818
13	Scalp outside v8	607.0	729.0	0.45434131736526945	79443.0	-79559.0	130.87808896210873	-109.13443072702331	-0.08682634730538297
15	Scalp v2	78.0	103.0	0.430939226519337	1702.0	-1748.0	21.82051282051282	-16.97087378640777	-0.25414364640883846
3	Outside v2	81.0	117.0	0.4090909090909091	7603.0	-7773.0	93.8641975308642	-66.43589743589743	-0.8585858585858475
14	Scalp v1	47.0	53.0	0.47	833.0	-1402.0	17.72340425531915	-26.452830188679247	-5.690000000000001
7	Scalp outside v2	87.0	124.0	0.41232227488151657	18476.0	-20010.0	212.367816091954	-161.3709677419355	-7.270142180094808
8	Scalp outside v3	66.0	90.0	0.4230769230769231	9255.0	-10768.0	140.22727272727272	-119.64444444444445	-9.698717948717949
12	Scalp outside v7	27.0	52.0	0.34177215189873417	4015.0	-5784.0	148.7037037037037	-111.23076923076923	-22.392405063291136
11	Scalp outside v6	29.0	60.0	0.3258426966292135	6816.0	-8878.0	235.0344827586207	-147.96666666666667	-23.168539325842687
9	Scalp outside v4	48.0	104.0	0.3157894736842105	8015.0	-11622.0	166.97916666666666	-111.75	-23.730263157894747
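
The per-group loop above can likely be replaced by a single `groupby` with named aggregation, which also avoids building an intermediate `df2`. The following is a minimal sketch, not a tested drop-in: the helper name `expectancy_table`, the temporary `win`/`loss` columns, and the sample `trades` frame are all hypothetical, and it assumes the frame has a numeric `pnl` column as in the question.

```python
import pandas as pd

def expectancy_table(df: pd.DataFrame, keys) -> pd.DataFrame:
    """Per-group win rate and expectancy (hypothetical helper, assumes a 'pnl' column)."""
    wins = df['pnl'].where(df['pnl'] >= 0)    # NaN on losing trades
    losses = df['pnl'].where(df['pnl'] < 0)   # NaN on winning trades
    out = (
        df.assign(win=wins, loss=losses)
          .groupby(keys)
          .agg(
              wincount=('win', 'count'),      # count() ignores NaN
              losscount=('loss', 'count'),
              profits=('win', 'sum'),
              losses=('loss', 'sum'),
              avgwin=('win', 'mean'),
              avgloss=('loss', 'mean'),
          )
    )
    out['winrate'] = out['wincount'] / (out['wincount'] + out['losscount'])
    # avgloss is negative, so the loss term is added
    out['expectancy'] = out['winrate'] * out['avgwin'] + (1 - out['winrate']) * out['avgloss']
    return out.reset_index()

# Made-up sample data, for illustration only
trades = pd.DataFrame({
    'day': ['2023-02-01', '2023-02-01', '2023-02-02',
            '2023-02-02', '2023-02-02', '2023-02-03'],
    'symbol': ['SPY'] * 6,
    'strategy': ['A', 'A', 'A', 'B', 'B', 'B'],
    'pnl': [100.0, -50.0, 30.0, -20.0, -40.0, 90.0],
})

result = expectancy_table(trades, ['symbol', 'strategy'])
print(result)
```

Passing `['day', 'symbol', 'strategy']` as `keys` would give the per-day breakdown the question asks about. Algebraically, `winrate * avgwin + (1 - winrate) * avgloss` reduces to the group's mean `pnl` whenever the group contains at least one winner and one loser, so `df.groupby(keys)['pnl'].mean()` is a quick cross-check on the expectancy column.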
