Create new column based on the values in a rolling window
Question
I have a DataFrame with a DateTime index and a column containing integers (in this example it only contains 0 and 1):
import pandas as pd

df = pd.DataFrame({
    "date": pd.date_range(start="2010-01-01 12:00", end="2010-01-01 12:05", freq="min"),
    "values": [1, 0, 0, 0, 1, 0],
})
                 date  values
0 2010-01-01 12:00:00       1
1 2010-01-01 12:01:00       0
2 2010-01-01 12:02:00       0
3 2010-01-01 12:03:00       0
4 2010-01-01 12:04:00       1
5 2010-01-01 12:05:00       0
I would like to return True if there is a 1 in a rolling time window of 2 minutes, otherwise False, as shown below:
                 date  values
0 2010-01-01 12:00:00    True   - because the window [1, 0] contains 1
1 2010-01-01 12:01:00   False   - because the window [0, 0] does not contain 1
2 2010-01-01 12:02:00   False
3 2010-01-01 12:03:00    True
4 2010-01-01 12:04:00    True
I tried .groupby(), but I didn't get very far.
Answer 1
Score: 1
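You can use rolling with a datetime index. Rolling windows look backward by default, so reverse the rows first to make the 2-minute window look forward in time, then reverse the result back: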
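df['date'] = pd.to_datetime(df['date'])
out = (
    df.set_index('date')[::-1]          # reverse so the window covers later rows
      .rolling('2min').max()            # 1.0 if any value in the window is 1
      .astype(bool)[::-1].reset_index() # to bool, then restore the original order
)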
Or:
out = (
    df[::-1]
      .rolling('2min', on='date').max()
      .astype({'values': bool})[::-1]
)
Output:
                 date  values
0 2010-01-01 12:00:00    True
1 2010-01-01 12:01:00   False
2 2010-01-01 12:02:00   False
3 2010-01-01 12:03:00    True
4 2010-01-01 12:04:00    True
5 2010-01-01 12:05:00   False
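As an alternative sketch, not part of the answer above: if the rows are assumed to be evenly spaced at one-minute intervals, a 2-minute forward window is always exactly two rows, and pandas' FixedForwardWindowIndexer can express that without reversing the frame. Note that it counts rows rather than time, so it is not equivalent to the '2min' offset when timestamps are irregular or missing.

```python
import pandas as pd
from pandas.api.indexers import FixedForwardWindowIndexer

df = pd.DataFrame({
    "date": pd.date_range(start="2010-01-01 12:00", end="2010-01-01 12:05", freq="min"),
    "values": [1, 0, 0, 0, 1, 0],
})

# Assumption: one row per minute, so "2 minutes ahead" means the current row plus the next one.
indexer = FixedForwardWindowIndexer(window_size=2)

out = df.assign(
    # min_periods=1 lets the last row, which has no successor, use a one-row window.
    values=df["values"].rolling(indexer, min_periods=1).max().astype(bool)
)
print(out)
```

On the sample data this should give the same True/False column as the reversed rolling('2min') approaches.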