How to replace column values with NaN based on index with pandas
Question
I have a DataFrame with multiple columns and a timestamp index. I want to select a range of rows in one specific column by index label and replace their values with NaN. I think I need to combine `.loc` and `.replace` to do this.
Example input, a DataFrame with a timestamp index and three columns:
Index 'A' 'B' 'C'
2023-02-03 10:00:00+00:00 0.1, 7, 8
2023-02-03 11:00:00+00:00 6, 5.6, 3.2
2023-02-03 12:00:00+00:00 9.5, 1.2, 6.3
2023-02-03 13:00:00+00:00 -0.2, 1.1, 4.2
2023-02-03 14:00:00+00:00 1.4, 7, 6.5
2023-02-03 15:00:00+00:00 2.6, -6, 4
Desired Output:
Index 'A' 'B' 'C'
2023-02-03 10:00:00+00:00 0.1, 7, 8
2023-02-03 11:00:00+00:00 6, 5.6, 3.2
2023-02-03 12:00:00+00:00 9.5, 1.2, 6.3
2023-02-03 13:00:00+00:00 -0.2, NaN, 4.2
2023-02-03 14:00:00+00:00 1.4, NaN, 6.5
2023-02-03 15:00:00+00:00 2.6, NaN, 4
The code:
df2 = df.replace(df.loc['2023-02-03 13:00:00+00:00':df.index[-1], 'B'], np.NaN)
It doesn't raise an error, but it doesn't work either: the output `df2` is identical to `df`.
Thanks!
Answer 1
Score: 0
Don't replace; assign directly:
```python
df2 = df.copy()  # only needed if you want to keep the original
df2.loc['2023-02-03 13:00:00+00:00':df2.index[-1], 'B'] = float('nan')
```
`df2`:
```
                       Index    A    B    C
0  2023-02-03 10:00:00+00:00  0.1  7.0  8.0
1  2023-02-03 11:00:00+00:00  6.0  5.6  3.2
2  2023-02-03 12:00:00+00:00  9.5  1.2  6.3
3  2023-02-03 13:00:00+00:00 -0.2  NaN  4.2
4  2023-02-03 14:00:00+00:00  1.4  NaN  6.5
5  2023-02-03 15:00:00+00:00  2.6  NaN  4.0
```
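For anyone who wants to try this end to end, here is a minimal, self-contained sketch that rebuilds the frame from the question and applies the direct assignment. The variable names (`index`, `start`, `df3`) are illustrative and not from the original post, and the `mask` variant at the end is an alternative added here, not part of the answer. As background, `replace` looks things up by value (or, for dict-like arguments, by column name) rather than by row position, which is presumably why the original call left the frame unchanged.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question with a tz-aware (UTC) index.
index = pd.to_datetime(
    [f"2023-02-03 {hour}:00:00+00:00" for hour in range(10, 16)], utc=True
)
df = pd.DataFrame(
    {
        "A": [0.1, 6, 9.5, -0.2, 1.4, 2.6],
        "B": [7, 5.6, 1.2, 1.1, 7, -6],
        "C": [8, 3.2, 6.3, 4.2, 6.5, 4],
    },
    index=index,
)

# Direct assignment: label-based slices with .loc include both endpoints,
# and an open-ended slice runs to the last row.
df2 = df.copy()  # keep the original untouched
df2.loc["2023-02-03 13:00:00+00:00":, "B"] = np.nan

# Alternative (an assumption of this sketch, not the answer's method):
# build a boolean condition on the index and let Series.mask insert NaN.
start = pd.Timestamp("2023-02-03 13:00:00+00:00")
df3 = df.assign(B=df["B"].mask(df.index >= start))

print(df2)
```

The `mask` form returns a new frame without an explicit `copy()`, which can be convenient in a method chain; the `.loc` assignment is the more direct choice when modifying a frame in place.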