Pandas calculate "bars since high"
Question
I am trying to calculate a rolling "bars since high" count in pandas that resets to 0 each time a new high is made. I can calculate the rolling high, but not the number of rows since it occurred.
For example:
import pandas as pd
df = pd.DataFrame([0, 1, 2, 3, 10, 3, 4, 5, 25], columns=['price'])
df['high'] = df['price'].rolling(window=100000, min_periods=1).max()
In this case, the desired output would be:
df['barssincehigh'] = [0,0,0,0,0,1,2,3,0]
But I can't think of a way to calculate the number of rows since the most recent high.
Answer 1
Score: 1
If your window is an arbitrarily large number and what you actually want is the cumulative maximum over the whole DataFrame, you can use:
df["high"] = df["price"].cummax()
# replaces:
# df['high'] = df['price'].rolling(window=100000, min_periods=1).max()
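As a quick sanity check on that substitution, the sketch below compares the two approaches on the sample data from the question. The variable names `cumulative_high` and `rolling_high` are introduced here only for illustration.

```python
import pandas as pd

df = pd.DataFrame([0, 1, 2, 3, 10, 3, 4, 5, 25], columns=["price"])

# Running maximum over all rows seen so far
cumulative_high = df["price"].cummax()

# Rolling maximum with a window larger than the data, as in the question
rolling_high = df["price"].rolling(window=100000, min_periods=1).max()

# rolling() returns floats, so compare values rather than dtypes
same = bool((cumulative_high == rolling_high).all())
print(same)
```

The two agree only while the window is at least as long as the data; with a shorter window, old highs would eventually drop out of the rolling maximum.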
To count the rows since each new maximum:
df["bars_since_high"] = df.groupby("high").cumcount()
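This grouping works because the cumulative maximum never decreases: each new maximum is greater than all previous ones, so every distinct value of high marks one contiguous run of rows, and cumcount() numbers the rows within that run starting from 0. Note that a price that merely equals the current high does not start a new group, so the count keeps running through ties.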
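Putting the two steps together, here is a minimal end-to-end sketch on the question's sample data. The helper name `bars_since_high` is hypothetical and not part of pandas; it simply wraps the cummax and cumcount calls from the answer.

```python
import pandas as pd


def bars_since_high(prices: pd.Series) -> pd.Series:
    """Rows elapsed since the running maximum last increased."""
    high = prices.cummax()
    # Each distinct running maximum forms one contiguous group;
    # cumcount numbers the rows within each group starting at 0.
    return prices.groupby(high).cumcount()


df = pd.DataFrame([0, 1, 2, 3, 10, 3, 4, 5, 25], columns=["price"])
df["high"] = df["price"].cummax()
df["bars_since_high"] = bars_since_high(df["price"])
print(df)
```

For this input the new column matches the desired output from the question, [0, 0, 0, 0, 0, 1, 2, 3, 0].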