Removing rows that do not meet a condition

Question

I have a dataframe like this:

Date	        Name	Score
Jan-1-2020	Jake	50
Jan-2-2020	Jake	30
Feb-1-2020	Paul	30
Jan-3-2020	Jake	30
Jan-2-2020	Paul	25

For each individual, I want to determine whether they scored less than 35% on two or more consecutive days.

First, I sorted the table by Name and Date:

Data = Data.sort_values(["Name", "Date"], ascending=[True, True])

Date Name Score
Jan-1-2020 Jake 50
Jan-2-2020 Jake 30
Jan-3-2020 Jake 30
Jan-2-2020 Paul 25
Feb-1-2020 Paul 30
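
The sorted order above only comes out chronologically if the Date column holds real datetimes; as plain strings, "Feb-1-2020" would sort before "Jan-2-2020". A minimal sketch of that preparation step, assuming the sample data from the question and assuming the date strings follow the "%b-%d-%Y" pattern (the names `df` and `df_sorted` are illustrative):

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "Date": ["Jan-1-2020", "Jan-2-2020", "Feb-1-2020", "Jan-3-2020", "Jan-2-2020"],
    "Name": ["Jake", "Jake", "Paul", "Jake", "Paul"],
    "Score": [50, 30, 30, 30, 25],
})

# Parse the strings so that sorting is chronological rather than alphabetical.
# The explicit format is an assumption based on the sample values.
df["Date"] = pd.to_datetime(df["Date"], format="%b-%d-%Y")

# Sort by person, then by date within each person
df_sorted = df.sort_values(["Name", "Date"])
print(df_sorted)
```

The original row labels are kept after sorting, which matters later when rows are looked up by label.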

I want to obtain one row for each individual that shows their minimum score over a period of 2 or more consecutive days:

Date Name Score
Jan-3-2020 Jake 30

Answer 1

Score: 1

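You can use rolling.sum to count, for each individual, the number of scores <= 35 within each 2-day (2D) window. A count of 2 marks the second of two consecutive low-scoring days (this assumes one row per individual per day):

import pandas as pd

df['Date'] = pd.to_datetime(df['Date'])

# Per-name rolling count of low scores over a 2-day window
idx = (df
   .sort_values(by='Date')
   .assign(flag=lambda d: d['Score'].le(35))
   .groupby('Name', group_keys=False)
   .apply(lambda g: g.rolling('2D', on='Date')['flag'].sum())
)

# Select by the index labels of the matching rows, not by the counts themselves
print(df.loc[idx[idx >= 2].index])

Output:

        Date  Name  Score
3 2020-01-03  Jake     30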
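
If runs longer than two days need to be reported as a single period, an alternative is to label runs of consecutive low-scoring days explicitly. The following is only a sketch under stated assumptions: it uses the strict "less than 35" condition from the question (the answer above uses <= 35), assumes one row per individual per day, and the names `THRESHOLD`, `MIN_DAYS`, `low`, `run` and `summary` are made up for illustration.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "Date": ["Jan-1-2020", "Jan-2-2020", "Feb-1-2020", "Jan-3-2020", "Jan-2-2020"],
    "Name": ["Jake", "Jake", "Paul", "Jake", "Paul"],
    "Score": [50, 30, 30, 30, 25],
})
df["Date"] = pd.to_datetime(df["Date"], format="%b-%d-%Y")  # assumed format

THRESHOLD = 35  # cutoff taken from the question
MIN_DAYS = 2    # minimum length of a qualifying run

# Keep only low-scoring days, ordered per person
low = df[df["Score"] < THRESHOLD].sort_values(["Name", "Date"]).copy()

# A new run starts at each person's first low day (gap is NaT) or whenever
# the gap to that person's previous low day is not exactly one day.
gap = low.groupby("Name")["Date"].diff()
low["run"] = (gap != pd.Timedelta(days=1)).cumsum()

# Keep runs that span at least MIN_DAYS consecutive days
run_len = low.groupby("run")["Score"].transform("size")
qualifying = low[run_len >= MIN_DAYS]

# One row per qualifying run: its date range and its minimum score
summary = (qualifying
    .groupby(["Name", "run"], as_index=False)
    .agg(start=("Date", "min"), end=("Date", "max"), min_score=("Score", "min"))
)
print(summary)
```

On the sample data this leaves a single run for Jake (Jan-2 to Jan-3, minimum score 30); Paul's two low scores are a month apart and therefore do not form a run.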

huangapple
  • Posted on May 11, 2023, 03:38:19
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76222041.html