Pandas calculate "bars since high"

Question

I am trying to calculate a rolling "bars since high" count in Pandas that resets whenever a new high is made. I can calculate the rolling high, but not the number of rows since it occurred.

For example:

import pandas as pd

df = pd.DataFrame([0, 1, 2, 3, 10, 3, 4, 5, 25], columns=['price'])
df['high'] = df['price'].rolling(window=100000, min_periods=1).max()

In this case, the desired output would be:

df['barssincehigh'] = [0, 0, 0, 0, 0, 1, 2, 3, 0]

But I can't think of a way to calculate the number of rows since the most recent high.

Answer 1

Score: 1

If your window is just an arbitrarily large number and what you actually want is the cumulative maximum across the whole DataFrame, you can use:

df["high"] = df.price.cummax()
# replaces:
# df['high'] = df['price'].rolling(window=100000, min_periods=1).max()
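
As a quick sanity check, the sketch below compares cummax with the oversized rolling window on the sample data from the question. The expanding().max() line is an addition that does not appear in the original answer; it is another standard pandas way to take a maximum over all rows so far. The rolling and expanding results come back as floats while cummax keeps the integer dtype, so the comparison casts first.

```python
import pandas as pd

df = pd.DataFrame([0, 1, 2, 3, 10, 3, 4, 5, 25], columns=["price"])

# Three ways to express "highest price seen so far".
high_cummax = df["price"].cummax()
high_rolling = df["price"].rolling(window=100000, min_periods=1).max()
high_expanding = df["price"].expanding().max()  # not in the original answer

# rolling/expanding return float64, so cast cummax before comparing.
same_as_rolling = high_cummax.astype(float).equals(high_rolling)
same_as_expanding = high_cummax.astype(float).equals(high_expanding)

print(high_cummax.tolist())
print(same_as_rolling, same_as_expanding)
```

Unlike the rolling version, cummax does not depend on the window being larger than the number of rows.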

To calculate counts for each new maximum:

df["bars_since_high"] = df.groupby("high").cumcount()

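This grouping works because the cumulative maximum never decreases: each new high is strictly greater than every previous high, so each value of high forms exactly one contiguous group, and cumcount numbers the rows within it starting from 0 at the row that set the high.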
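
Putting the two steps together, a minimal end-to-end sketch on the question's data might look like the following. The helper names bars_since_high and bars_since_high_or_tie are hypothetical, chosen only for illustration. The second helper is an assumption about what some readers may want: with the groupby-on-high approach, a price that merely equals the previous high does not reset the count, so the variant starts a new run whenever the price touches the running maximum.

```python
import pandas as pd


def bars_since_high(price: pd.Series) -> pd.Series:
    """Rows elapsed since the running maximum last increased (hypothetical helper)."""
    high = price.cummax()
    # Rows that share a running-max value belong to the same run.
    return price.groupby(high).cumcount()


def bars_since_high_or_tie(price: pd.Series) -> pd.Series:
    """Variant (an assumption, not from the answer): ties with the high also reset."""
    high = price.cummax()
    # A new run starts on every row where the price equals the running max.
    run_id = price.eq(high).cumsum()
    return price.groupby(run_id).cumcount()


df = pd.DataFrame([0, 1, 2, 3, 10, 3, 4, 5, 25], columns=["price"])
df["bars_since_high"] = bars_since_high(df["price"])
print(df["bars_since_high"].tolist())  # [0, 0, 0, 0, 0, 1, 2, 3, 0]

# The two helpers differ only when a price ties the previous high.
tied = pd.Series([10, 3, 10, 4])
print(bars_since_high(tied).tolist())         # [0, 1, 2, 3]
print(bars_since_high_or_tie(tied).tolist())  # [0, 1, 0, 1]
```

Which behavior is right depends on whether a retest of the old high should count as a new high in your use case.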

huangapple
  • Published on June 9, 2023, 03:02:24
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76434958.html