Question

I have a Python pandas DataFrame with winning streaks for some teams over several time periods, and I would like to identify each team's streaks chronologically. So, what I have is:

import pandas as pd
data = pd.DataFrame({'period': list(range(1,7))+list(range(1,6)),
    'team_id':       ['A']*6 + ['B']*5,
    'win':           [1,1,1,0,1,1,1,0,0,1,1],
    'streak_length': [1,2,3,0,1,2,1,0,0,1,2]})
print(data)

And what I would like to have is:

result = pd.DataFrame({'period': list(range(1,7))+list(range(1,6)),
    'team_id':       ['A']*6 + ['B']*5,
    'win':           [1,1,1,0,1,1,1,0,0,1,1],
    'streak_length': [1,2,3,0,1,2,1,0,0,1,2],
    'streak_id':     [1,1,1,None,2,2,1,None,None,2,2]})
print(result)

I tried grouping by team_id and summing over streak_length, but the lengths can repeat, so I don't think this would work. Any help appreciated!

Answer 1

Score: 6

Create consecutive groups with Series.shift, Series.ne and Series.cumsum, keep only the rows where win is 1, and use GroupBy.transform with pd.factorize in a lambda function to renumber the groups from 1 within each team:

# True on winning rows
m = data['win'].eq(1)
# Label for each run of equal consecutive values in win
g = data['win'].ne(data['win'].shift()).cumsum()

# Renumber the winning runs 1, 2, ... within each team; losing rows stay NaN
data['streak_id'] = g[m].groupby(data['team_id']).transform(
    lambda x: pd.factorize(x)[0] + 1
)

print(data)
    period team_id  win  streak_length  streak_id
0        1       A    1              1        1.0
1        2       A    1              2        1.0
2        3       A    1              3        1.0
3        4       A    0              0        NaN
4        5       A    1              1        2.0
5        6       A    1              2        2.0
6        1       B    1              1        1.0
7        2       B    0              0        NaN
8        3       B    0              0        NaN
9        4       B    1              1        2.0
10       5       B    1              2        2.0
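
For comparison, here is a minimal sketch of a different route to the same numbering. It is not the approach from the answer above: it flags the first row of every winning streak within each team and takes a per-team cumulative sum of those flags. The names prev_win and streak_start are illustrative, and the sketch assumes the rows are already sorted by period within each team, as in the question. Casting to the nullable Int64 dtype is an optional extra that keeps the IDs as integers instead of floats.

```python
import pandas as pd

data = pd.DataFrame({
    'period': list(range(1, 7)) + list(range(1, 6)),
    'team_id': ['A'] * 6 + ['B'] * 5,
    'win': [1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1],
})

# True on winning rows
win = data['win'].eq(1)

# Whether the previous row of the same team was a win
# (the first row of each team is treated as following a loss)
prev_win = data.groupby('team_id')['win'].shift(fill_value=0).eq(1)

# A streak starts on a win that does not follow a win of the same team
streak_start = win & ~prev_win

# Count streak starts per team, blank out losing rows,
# and store the result as nullable integers
data['streak_id'] = (
    streak_start.astype(int)
    .groupby(data['team_id'])
    .cumsum()
    .where(win)
    .astype('Int64')
)

print(data)
```

Because the shift is done per team, a streak that ends one team's rows is never merged with a streak that begins the next team's rows.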
