How to replace column values with NaN based on index with pandas

Question

I have a DataFrame with multiple columns and a timestamp index. I want to select a range of rows in one specific column by their index labels and replace those values with NaN. I think I need to combine `.loc` and `.replace` to do this.

Example input, a DataFrame with a timestamp index and three columns:

Index                         A     B     C
2023-02-03 10:00:00+00:00   0.1     7     8
2023-02-03 11:00:00+00:00     6   5.6   3.2
2023-02-03 12:00:00+00:00   9.5   1.2   6.3
2023-02-03 13:00:00+00:00  -0.2   1.1   4.2
2023-02-03 14:00:00+00:00   1.4     7   6.5
2023-02-03 15:00:00+00:00   2.6    -6     4

Desired output:

Index                         A     B     C
2023-02-03 10:00:00+00:00   0.1     7     8
2023-02-03 11:00:00+00:00     6   5.6   3.2
2023-02-03 12:00:00+00:00   9.5   1.2   6.3
2023-02-03 13:00:00+00:00  -0.2   NaN   4.2
2023-02-03 14:00:00+00:00   1.4   NaN   6.5
2023-02-03 15:00:00+00:00   2.6   NaN     4

The code:

df2=df.replace(df.loc['2023-02-03 13:00:00+00:00':df.index[-1],'B'],np.NaN)

This doesn't raise an error, but it doesn't work either: the output `df2` is identical to `df`.

Thanks!

Answer 1

Score: 0

Don't replace; assign directly:

```python
df2 = df.copy()  # if you need to keep the original

df2.loc['2023-02-03 13:00:00+00:00':df2.index[-1], 'B'] = float('nan')
```

`df2`:

```
                       Index    A    B    C
0  2023-02-03 10:00:00+00:00  0.1  7.0  8.0
1  2023-02-03 11:00:00+00:00  6.0  5.6  3.2
2  2023-02-03 12:00:00+00:00  9.5  1.2  6.3
3  2023-02-03 13:00:00+00:00 -0.2  NaN  4.2
4  2023-02-03 14:00:00+00:00  1.4  NaN  6.5
5  2023-02-03 15:00:00+00:00  2.6  NaN  4.0
```
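For anyone who wants to try this end to end, here is a minimal, self-contained sketch of the assignment approach. It assumes the timestamps are the DataFrame's index (a timezone-aware `DatetimeIndex`, as the question describes) and rebuilds the sample data by hand. The names `start` and `replaced` are illustrative and not from the original post, and the open-ended slice `start:` stands in for `df2.index[-1]`. It uses `np.nan` rather than `np.NaN`, an alias that NumPy 2.0 removed. The last step also illustrates that `replace` matches cell values, not index labels, which is why it is an awkward fit for this task.

```python
import numpy as np
import pandas as pd

# Rebuild the sample data from the question.
# Assumption: the timestamps form a tz-aware (UTC) DatetimeIndex.
index = pd.to_datetime(
    [
        "2023-02-03 10:00:00+00:00",
        "2023-02-03 11:00:00+00:00",
        "2023-02-03 12:00:00+00:00",
        "2023-02-03 13:00:00+00:00",
        "2023-02-03 14:00:00+00:00",
        "2023-02-03 15:00:00+00:00",
    ],
    utc=True,
)
df = pd.DataFrame(
    {
        "A": [0.1, 6, 9.5, -0.2, 1.4, 2.6],
        "B": [7, 5.6, 1.2, 1.1, 7, -6],
        "C": [8, 3.2, 6.3, 4.2, 6.5, 4],
    },
    index=index,
)

# Work on a copy so the original frame stays untouched.
df2 = df.copy()

# Label-based slice: every row from `start` (inclusive) to the end, column B only.
start = "2023-02-03 13:00:00+00:00"  # illustrative variable name
df2.loc[start:, "B"] = np.nan
print(df2)

# For contrast: replace() matches cell *values*, wherever they occur.
# Here it blanks every 7 in the frame, regardless of the row's timestamp.
replaced = df.replace(7, np.nan)
print(replaced)
```

The `.loc` assignment touches only the selected labels in column `B`, whereas `replace(7, np.nan)` changes the rows at 10:00 and 14:00 because those are the cells that hold the value 7.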



huangapple
  • Posted on February 8, 2023, 20:39:15
  • If you repost, please keep the link to this article: https://go.coder-hub.com/75385914.html