Pandas check if any of previous n rows met criteria
Question
Say I have this data set.
| | color | down | top |
| -- | ------ | ---- | --- |
| 0 | | 1 | 5 |
| 1 | | 2 | 5 |
| 2 | blue | 7 | 11 |
| 3 | | 5 | 8 |
| 4 | | 9 | 10 |
| 5 | | 9 | 10 |
| 6 | orange | 5 | 9 |
| 7 | | 4 | 7 |
| 8 | | 5 | 10 |
| 9 | | 5 | 6 |
| 10 | | 3 | 7 |
I want to flag rows where any of the 3 previous rows is either blue or orange AND where the current top is between the down and top of that row.
The outcome of this operation would be a dataset:
| | color | down | top | blue_condition | orange_condition |
| -- | ------ | ---- | --- | -------------- | ---------------- |
| 0 | | 1 | 5 | | |
| 1 | | 2 | 5 | | |
| 2 | blue | 7 | 11 | | |
| 3 | | 5 | 8 | 1 | |
| 4 | | 9 | 10 | 1 | |
| 5 | | 9 | 10 | 1 | |
| 6 | orange | 5 | 9 | | |
| 7 | | 4 | 7 | | 1 |
| 8 | | 5 | 10 | | |
| 9 | | 5 | 6 | | 1 |
| 10 | | 3 | 7 | | |
I have been fiddling around with combinations of `.tail()`, `.filter()` and `.assign()`, but I am a bit stuck, to be honest.
import pandas as pd

df = pd.DataFrame({"color": [None, None, 'blue', None, None, None, 'orange', None, None, None, None],
                   'down': [1, 2, 7, 5, 9, 9, 5, 4, 5, 5, 3],
                   'top': [5, 5, 11, 8, 10, 10, 9, 7, 10, 6, 7]})
# get latest 3 records
df['blue_condition'] = df.tail(3)
# assign using lambda
df['blue_condition'].assign(blue_condition=lambda x: (x.tail(3).query(top < top.tail(3))))
I have looked into other questions but they don't take the current row as a reference, from what I can tell.
https://stackoverflow.com/questions/74734980/pandas-return-true-if-condition-true-in-any-of-previous-n-rows
Answer 1
Score: 2
You can use a `pivot`, then `ffill` + `shift` the data and use this to compare to the top value:
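import numpy as np

N = 3  # number of previous rows to look back
# one column per (value, color) pair; rows without a color end up under NaN
df2 = (df.pivot(columns='color', values=['top', 'down'])
         .drop(columns=np.nan, level=1)   # discard the "no color" columns
         .ffill(limit=N-1).shift()        # carry each marker's range over the next N rows
      )
# flag rows whose top lies within [down, top] of the carried-over marker row
out = df.join((df2['down'].le(df['top'], axis=0)
               & df2['top'].ge(df['top'], axis=0)).astype(int))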
Output:
     color  down  top  blue  orange
0     None     1    5     0       0
1     None     2    5     0       0
2     blue     7   11     0       0
3     None     5    8     1       0
4     None     9   10     1       0
5     None     9   10     1       0
6   orange     5    9     0       0
7     None     4    7     0       1
8     None     5   10     0       0
9     None     5    6     0       1
10    None     3    7     0       0
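One thing to keep in mind with the forward-fill approach: `ffill` only carries the most recent marker of each color, so if two rows of the same color fall within N rows of each other, the earlier one's range is overwritten. If that case matters for your data, a possible alternative is to compare against each of the N previous rows explicitly with `shift`. The following is only a sketch under that assumption; the helper name `flag_previous` is made up for illustration, and the bounds are treated as inclusive, as in the code above.

```python
import pandas as pd

df = pd.DataFrame({
    "color": [None, None, "blue", None, None, None, "orange", None, None, None, None],
    "down": [1, 2, 7, 5, 9, 9, 5, 4, 5, 5, 3],
    "top": [5, 5, 11, 8, 10, 10, 9, 7, 10, 6, 7],
})


def flag_previous(df, color, n=3):
    """Hypothetical helper: 1 where any of the previous n rows has `color`
    and the current `top` lies within that row's [down, top], else 0."""
    is_color = df["color"].eq(color)
    hit = pd.Series(False, index=df.index)
    for k in range(1, n + 1):
        # values of the row k positions above the current one
        prev_is_color = is_color.shift(k, fill_value=False)
        prev_down = df["down"].shift(k)
        prev_top = df["top"].shift(k)
        # comparisons against NaN (the first k rows) evaluate to False
        in_range = df["top"].ge(prev_down) & df["top"].le(prev_top)
        hit |= prev_is_color & in_range
    return hit.astype(int)


out = df.assign(
    blue_condition=flag_previous(df, "blue"),
    orange_condition=flag_previous(df, "orange"),
)
print(out)
```

On the sample data this flags rows 3, 4 and 5 for blue and rows 7 and 9 for orange, the same rows as the output above. It loops N times over whole columns rather than over rows, so it should stay reasonably fast for small N.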