Rolling Mean/Average within a For Loop on a Dataframe Python
Question
I went through a bunch of posts and couldn't find a more Pythonic solution. I have a DataFrame and run a for loop over it to calculate several metrics. Many of the columns depend on each other, so at each iteration I want to calculate everything up to that row. The issue is that the only way I have found to make the rolling method work within the loop is to run it over the entire column on every iteration. I am sure there has to be a better way. Here is a sample. I have the following DataFrame, where the Value column is generated within the for loop:
Minute  Value  Rolling Mean
1       3      0
2       5      0
3       8      5.3333
4       4      5.6667
5       6      6
6       7      5.6667
This is what I am using to calculate the mean within the for loop:
n_periods = 3
df['Rolling Mean'] = df['Value'].rolling(n_periods, min_periods = 0).mean(skipna=False)
The problem here is that as you iterate within the for loop it reruns the entire column every row, and I have thousands of them, so it is very slow.
I would want something more like this (which doesn't work), which would only run one calculation per row throughout the loop.
for i in range(1, len(df)):
    df.at[i, 'Rolling Mean'] = df.at[i, 'Value'].rolling(n_periods, min_periods=0).mean(skipna=False)
Any thoughts?
Thank you all!
Answer 1
Score: 0
Ok, so after some trial and error, I figured this line worked. In case anybody else needs it:
# Loop through the indices of the DataFrame
for i in range(1, len(df)):
    # Calculate the rolling mean using df.at and assign it to the 'Rolling Mean' column
    df.at[i, 'Rolling Mean'] = df['Value'][0:i+1].rolling(window=n_periods, min_periods=0).mean().iloc[-1]
Thank you all for the support! This community is amazing!
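As a quick sanity check, the per-row line above can be compared against a single rolling call over the whole column. This is only a sketch on the sample data from the question: the DataFrame construction, the loop starting at row 0, and the name expected are assumptions added for illustration, and skipna=False is left out because it may not be accepted by rolling(...).mean() in every pandas version.

```python
import pandas as pd

# Sample data from the question; in the real loop 'Value' is filled in row by row.
df = pd.DataFrame({"Minute": [1, 2, 3, 4, 5, 6], "Value": [3, 5, 8, 4, 6, 7]})
n_periods = 3

# Per-row version, as in the answer above: roll over rows 0..i and keep the last result.
df["Rolling Mean"] = float("nan")
for i in range(len(df)):
    df.at[i, "Rolling Mean"] = (
        df["Value"].iloc[: i + 1].rolling(window=n_periods, min_periods=0).mean().iloc[-1]
    )

# One-shot version over the whole column, used here only for comparison.
expected = df["Value"].rolling(window=n_periods, min_periods=0).mean()

print(df["Rolling Mean"].round(4).tolist())
print((df["Rolling Mean"] - expected).abs().max() < 1e-9)
```

Note that with min_periods=0 the first two rows come out as 3.0 and 4.0 (the mean of the values seen so far), not the 0 shown in the question's table.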
Answer 2
Score: 0
It may be better to first understand what a rolling mean is and where you want the window centered. Then you just implement the math:
for i in range(1, len(df)):
    df.at[i, 'Rolling Mean'] = df.loc[:i, 'Value'].tail(n_periods).mean()
- This needs much less computing power than your answer because it averages only the last n_periods values instead of running rolling over every row from the beginning of your DataFrame.
- The window is right-aligned just like in your example. In general, one can pass center=True to df.rolling, but that doesn't work if you don't know the future values. You should be aware of that difference.
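If even slicing the last n_periods rows on every iteration is too slow, "implementing the math" can go one step further by keeping a running sum of the current window, so each new row costs a constant amount of work. The following is a minimal sketch in plain Python, not something from the answers above: the class RunningMean and its update method are hypothetical names, and it assumes the values contain no NaN.

```python
from collections import deque


class RunningMean:
    """Trailing (right-aligned) mean over the last n_periods values (hypothetical helper)."""

    def __init__(self, n_periods):
        self.window = deque(maxlen=n_periods)
        self.total = 0.0

    def update(self, value):
        # If the window is full, append() will evict the oldest value,
        # so remove it from the running total first.
        if len(self.window) == self.window.maxlen:
            self.total -= self.window[0]
        self.window.append(value)
        self.total += value
        # Divide by the number of values seen so far, capped at n_periods.
        return self.total / len(self.window)


running = RunningMean(n_periods=3)
means = [running.update(v) for v in [3, 5, 8, 4, 6, 7]]
print([round(m, 4) for m in means])  # [3.0, 4.0, 5.3333, 5.6667, 6.0, 5.6667]
```

Inside the original loop this would presumably be used as df.at[i, 'Rolling Mean'] = running.update(df.at[i, 'Value']). Because the window only ever looks backwards, it matches the right-aligned behaviour described above. On very long series a running sum can accumulate small floating-point error, so it may be worth checking it occasionally against a direct mean.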