How to replace column values with NaN based on index with pandas

Question

I have a DataFrame with several columns and a timestamp index. I want to select a range of rows in one specific column by index label and replace those values with NaN. I think I need to combine .loc and .replace to do this.

Example input, a DataFrame with a timestamp index and three columns:

    Index                       A      B     C
    2023-02-03 10:00:00+00:00   0.1    7     8
    2023-02-03 11:00:00+00:00   6      5.6   3.2
    2023-02-03 12:00:00+00:00   9.5    1.2   6.3
    2023-02-03 13:00:00+00:00   -0.2   1.1   4.2
    2023-02-03 14:00:00+00:00   1.4    7     6.5
    2023-02-03 15:00:00+00:00   2.6    -6    4

Desired output:

    Index                       A      B     C
    2023-02-03 10:00:00+00:00   0.1    7     8
    2023-02-03 11:00:00+00:00   6      5.6   3.2
    2023-02-03 12:00:00+00:00   9.5    1.2   6.3
    2023-02-03 13:00:00+00:00   -0.2   NaN   4.2
    2023-02-03 14:00:00+00:00   1.4    NaN   6.5
    2023-02-03 15:00:00+00:00   2.6    NaN   4

The code:

    df2 = df.replace(df.loc['2023-02-03 13:00:00+00:00':df.index[-1], 'B'], np.NaN)

doesn't raise an error, but it doesn't work either: the output df2 is identical to df.

Thanks!

Answer 1

Score: 0

Don't replace; assign directly:

```python
df2 = df.copy()  # if you need to keep the original
df2.loc['2023-02-03 13:00:00+00:00':df2.index[-1], 'B'] = float('nan')
```

df2:

                           Index    A    B    C
    0  2023-02-03 10:00:00+00:00  0.1  7.0  8.0
    1  2023-02-03 11:00:00+00:00  6.0  5.6  3.2
    2  2023-02-03 12:00:00+00:00  9.5  1.2  6.3
    3  2023-02-03 13:00:00+00:00 -0.2  NaN  4.2
    4  2023-02-03 14:00:00+00:00  1.4  NaN  6.5
    5  2023-02-03 15:00:00+00:00  2.6  NaN  4.0
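
For anyone who wants to run the direct assignment end to end, here is a minimal sketch. It rebuilds the question's data under the assumption that the timestamps form the index (a timezone-aware DatetimeIndex in UTC); the construction code is not from the original post. It also uses an open-ended slice, which reaches the last row without spelling out `df2.index[-1]`.

```python
import numpy as np
import pandas as pd

# Assumed setup: hourly UTC timestamps as the index, as described in the question.
index = pd.date_range(
    "2023-02-03 10:00", periods=6, freq=pd.Timedelta(hours=1), tz="UTC"
)
df = pd.DataFrame(
    {
        "A": [0.1, 6, 9.5, -0.2, 1.4, 2.6],
        "B": [7, 5.6, 1.2, 1.1, 7, -6],
        "C": [8, 3.2, 6.3, 4.2, 6.5, 4],
    },
    index=index,
)

df2 = df.copy()  # work on a copy so df keeps its original values

# Label-based slices in .loc include the start label, and leaving the
# stop empty runs the slice through the last row of the frame.
df2.loc["2023-02-03 13:00:00+00:00":, "B"] = np.nan

print(df2)
```

`replace` searches for matching values, while the goal here is to pick rows by their index labels, which is why plain assignment through `.loc` is the more natural fit.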
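
A boolean mask on the index is another way to get the same result, and it may be handy when the rows to blank out are not one contiguous label range. The sketch below is an illustration, not part of the original answer: the data setup is assumed from the question, and the names `cutoff`, `late`, `df_loc` and `df_mask` are hypothetical.

```python
import numpy as np
import pandas as pd

# Assumed setup: the question's data with hourly UTC timestamps as the index.
index = pd.date_range(
    "2023-02-03 10:00", periods=6, freq=pd.Timedelta(hours=1), tz="UTC"
)
df = pd.DataFrame(
    {
        "A": [0.1, 6, 9.5, -0.2, 1.4, 2.6],
        "B": [7, 5.6, 1.2, 1.1, 7, -6],
        "C": [8, 3.2, 6.3, 4.2, 6.5, 4],
    },
    index=index,
)

cutoff = pd.Timestamp("2023-02-03 13:00:00+00:00")  # hypothetical name
late = df.index >= cutoff  # boolean array with one entry per row

# Option 1: assign NaN through .loc using the boolean mask.
df_loc = df.copy()
df_loc.loc[late, "B"] = np.nan

# Option 2: Series.mask puts NaN wherever the condition is True and
# returns a new Series, so df itself is left unchanged.
df_mask = df.assign(B=df["B"].mask(late))

print(df_mask)
```

Both options leave `df` untouched; the second avoids the explicit copy because `assign` returns a new DataFrame.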

huangapple
  • Posted on February 8, 2023 at 20:39:15
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75385914.html