Pandas Dataframe interpolate inside with constant value
Question
How can I turn this DataFrame:
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'col1': [100, np.nan, np.nan, 100, np.nan, np.nan, np.nan],
    'col2': [np.nan, 100, np.nan, np.nan, np.nan, 100, np.nan]})
df
   col1  col2
0   100   NaN
1   NaN   100
2   NaN   NaN
3   100   NaN
4   NaN   NaN
5   NaN   100
6   NaN   NaN
into:
   col1  col2
0   100   NaN
1     0   100
2     0     0
3   100     0
4   NaN   NaN
5   NaN   100
6   NaN   NaN
So basically I want to fill NaNs with zero only in the inside area (between valid values), with limit=2. Note that col2 has three consecutive NaNs in the middle and only two of them are replaced with zero.
Answer 1
Score: 1
You can build masks to identify the non-NA values and the inner positions (with the help of a double cummax):
m = df.notna()
m2 = m.cummax() & m[::-1].cummax()
out = df.fillna(df.mask(m, 0).ffill(limit=2).where(m2))
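To see why the double cummax marks the inside area, it may help to look at the intermediate masks. The following is only an illustrative sketch on the sample data from the question; the names `seen_before`, `seen_after` and `zeros` are made up here and do not appear in the answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'col1': [100, np.nan, np.nan, 100, np.nan, np.nan, np.nan],
    'col2': [np.nan, 100, np.nan, np.nan, np.nan, 100, np.nan]})

m = df.notna()                 # True where a real value exists
seen_before = m.cummax()       # True from the first valid value downwards
seen_after = m[::-1].cummax()  # True from the last valid value upwards (index reversed)
m2 = seen_before & seen_after  # & aligns on the index, so the reversed order is harmless

# Valid values become 0, so a forward fill spreads zeros over at most 2 NaN rows
zeros = df.mask(m, 0).ffill(limit=2)

# Keep the zeros only between the first and last valid value,
# then use them to fill the NaNs of the original frame
out = df.fillna(zeros.where(m2))

print(m2)
print(out)
```

Rows outside `m2` (leading and trailing NaNs) are never filled, which is what restricts the result to the inside area.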
Or with interpolate:
m = df.notna()
out = df.fillna(df.mask(m, 0).interpolate(limit=2, limit_area='inside'))
# or if you only have numbers
out = df.fillna(df.mul(0).interpolate(limit=2, limit_area='inside'))
Output:
    col1   col2
0  100.0    NaN
1    0.0  100.0
2    0.0    0.0
3  100.0    0.0
4    NaN    NaN
5    NaN  100.0
6    NaN    NaN
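If this is needed in more than one place, the interpolate variant could be wrapped in a small helper. This is a sketch under the assumption that all columns are numeric; `fill_inside` and its `value` and `limit` parameters are hypothetical names, not part of pandas.

```python
import numpy as np
import pandas as pd

def fill_inside(df, value=0, limit=2):
    """Fill at most `limit` consecutive NaNs with `value`, only between valid values."""
    m = df.notna()
    # Replace every valid value with `value`; linear interpolation between
    # two equal constants yields that same constant.
    filler = df.mask(m, value).interpolate(limit=limit, limit_area='inside')
    # Only the NaNs of the original frame are taken from `filler`
    return df.fillna(filler)

df = pd.DataFrame({
    'col1': [100, np.nan, np.nan, 100, np.nan, np.nan, np.nan],
    'col2': [np.nan, 100, np.nan, np.nan, np.nan, 100, np.nan]})

print(fill_inside(df))
```

With the defaults this reproduces the output above; passing a different `limit` changes how many NaNs in each inner gap are replaced.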
Answer 2
Score: 0
We could do:
out = df.ffill(limit=2).mask(df.bfill().isna())
out = out.mask(out.ne(df) & out.notna(), 0)
Output:
    col1   col2
0  100.0    NaN
1    0.0  100.0
2    0.0    0.0
3  100.0    0.0
4    NaN    NaN
5    NaN  100.0
6    NaN    NaN
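The two lines of this answer can be read step by step as in the sketch below. The intermediate names `filled`, `trailing` and `was_filled` are introduced here purely for illustration; the data is the sample from the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'col1': [100, np.nan, np.nan, 100, np.nan, np.nan, np.nan],
    'col2': [np.nan, 100, np.nan, np.nan, np.nan, 100, np.nan]})

# Step 1: forward-fill at most 2 rows, carrying the previous valid value (100 here)
filled = df.ffill(limit=2)

# Step 2: rows after the last valid value have nothing to back-fill from,
# so df.bfill() is still NaN there; blank those rows out again
trailing = df.bfill().isna()
out = filled.mask(trailing)

# Step 3: a cell that is non-NaN now but differs from the original
# (NaN != anything) was filled in step 1; replace it with 0
was_filled = out.ne(df) & out.notna()
out = out.mask(was_filled, 0)

print(out)
```

Leading NaNs need no special handling here, because a forward fill has no earlier value to carry into them.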