pandas rolling apply on a custom function
Question
I would like to apply pandas.rank on a rolling basis.
I tried to use pandas.rolling.apply, but unfortunately rolling doesn't work with 'rank'.
Is there a way around this?
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(10, 3))
def my_rank(x):
    return x.rank(pct=True)
df.rolling(3).apply(my_rank)
Answer 1
Score: 2
Code:
def my_rank(x):
    # Rank every value in the window, then keep only the rank of the last one
    return pd.Series(x).rank(pct=True).iloc[-1]

df.rolling(3).apply(my_rank)
Output:
          0         1         2
0       NaN       NaN       NaN
1       NaN       NaN       NaN
2  0.666667  0.333333  0.666667
3  1.000000  0.333333  1.000000
4  0.666667  1.000000  0.333333
5  0.333333  0.666667  0.666667
6  1.000000  0.333333  0.666667
7  0.333333  0.333333  1.000000
8  1.000000  0.666667  1.000000
9  0.666667  1.000000  0.666667
Explanation:
Your code (great minimal reproducible example, by the way!) threw the following error:
AttributeError: 'numpy.ndarray' object has no attribute 'rank'
This meant the x in your my_rank function was being passed as a NumPy array, not a pandas Series. So first I updated return x.rank... to return pd.Series(x).rank..
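Whether x arrives as a NumPy array or as a Series depends on the raw argument of rolling.apply, and as far as I recall its default has changed between pandas versions. The sketch below just records the type each setting hands to the function. The names seen and make_probe are made up for this illustration, and the small series is arbitrary.

```python
import numpy as np
import pandas as pd

seen = {}  # hypothetical helper: remembers the type passed for each setting

def make_probe(label):
    def probe(x):
        seen[label] = type(x)  # record what rolling.apply handed us
        return 0.0             # apply expects one scalar per window
    return probe

s = pd.Series([3.0, 1.0, 2.0, 5.0, 4.0])
s.rolling(3).apply(make_probe("raw=True"), raw=True)
s.rolling(3).apply(make_probe("raw=False"), raw=False)

print(seen)
```

Wrapping the argument in pd.Series(x), as in the answer, works in both cases, which makes the function independent of the raw setting.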
Then I got the following error:
TypeError: cannot convert the series to <class 'float'>
This makes sense, because pd.Series.rank takes a series of n numbers and returns a (ranked) series of n numbers. But since we're calling rank not once on a series but repeatedly on a rolling window of a series, we only want one number as output for each rolling calculation: the rank of the window's last value. Hence the iloc[-1].
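To make the "n numbers in, one number out" point concrete, here is a small sketch on fixed data rather than random data. It also includes a NumPy-only variant, last_pct_rank, which is my own hypothetical helper and not part of pandas. It assumes the default average method for ties and no NaN values inside a window.

```python
import numpy as np
import pandas as pd

def my_rank(x):
    # Rank all values in the window, keep only the newest value's rank.
    return pd.Series(x).rank(pct=True).iloc[-1]

def last_pct_rank(x):
    # Hypothetical NumPy-only equivalent (assumes no NaN in the window):
    # percentile rank of the last value, ties sharing the average rank.
    last = x[-1]
    less = np.sum(x < last)    # values strictly below the last one
    equal = np.sum(x == last)  # tied values, including the last itself
    return (less + (equal + 1) / 2) / len(x)

s = pd.Series([3.0, 1.0, 2.0, 5.0, 4.0, 4.0])

# rank on a single window returns one rank per value: three numbers here
window_ranks = s.iloc[:3].rank(pct=True)

# rolling.apply needs one scalar per window, so each function returns
# only the last value's rank
via_pandas = s.rolling(3).apply(my_rank, raw=True)
via_numpy = s.rolling(3).apply(last_pct_rank, raw=True)

print(window_ranks.tolist())  # 1.0, 0.333..., 0.666...
print(via_pandas.tolist())    # NaN, NaN, 0.666..., 1.0, 0.666..., 0.5
```

The NumPy variant skips building a Series for every window, which may matter on large frames, though I have not benchmarked it. If I remember correctly, pandas 1.4 and later also provide a built-in rolling rank, so it is worth checking the documentation for your version before writing a custom function.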