How to replace column values outside a particular range with null values using a conditional in Python

Question

I have a DataFrame that contains an age column. Some of the values are outside my desired range, and I want to replace them with null values: any age that is not between 20 and 50 should become null.

This is what I tried, but it doesn't seem to work:

    import pandas as pd
    import numpy as np
    age_range = (df['age'] < 20) | (df['age'] > 50)
    df[age_range = np.nan]

Answer 1

Score: 0

It's a simple syntax error: the assignment has to go outside the square brackets. Index with df.loc, passing the mask and the column name, so that only the age column is changed:

    import pandas as pd
    import numpy as np

    df = pd.DataFrame({'age': [18, 25, 35, 40, 55]})
    # True for rows whose age is below 20 or above 50
    age_range = (df['age'] < 20) | (df['age'] > 50)
    # Replace only those ages with NaN
    df.loc[age_range, 'age'] = np.nan
    print(df)

which gives

        age
    0   NaN
    1  25.0
    2  35.0
    3  40.0
    4   NaN
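
As an alternative sketch (not part of the answer above), the mask can be inverted: Series.between marks the ages to keep, and Series.where replaces everything else with NaN. The name in_range is introduced here only for illustration. between includes both endpoints by default, so 20 and 50 are kept, which matches the original condition.

```python
import pandas as pd

df = pd.DataFrame({"age": [18, 25, 35, 40, 55]})

# True for ages from 20 to 50, endpoints included
in_range = df["age"].between(20, 50)

# where() keeps values where the mask is True and sets the rest to NaN
df["age"] = df["age"].where(in_range)
print(df)
```

This avoids spelling out the two comparisons by hand, at the cost of hiding the boundary behaviour inside between.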

Answer 2

Score: 0

You can also use np.where, which returns NaN where the condition is true and the original age otherwise:

    import pandas as pd
    import numpy as np

    df = pd.DataFrame({'age': [18, 22, 35, 55, 42]})
    # NaN where age is below 20 or above 50, the original value elsewhere
    df['age'] = np.where((df['age'] < 20) | (df['age'] > 50), np.nan, df['age'])
    print(df)

Output:

        age
    0   NaN
    1  22.0
    2  35.0
    3   NaN
    4  42.0
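
The ages print as floats (22.0, 35.0) because NaN is a floating-point value, so an integer column is upcast to float64 when NaN is written into it. If whole-number ages matter, one option is pandas' nullable integer dtype "Int64" together with pd.NA. The following is only a sketch under that assumption; the names ages and out_of_range are introduced for illustration.

```python
import pandas as pd

ages = pd.Series([18, 22, 35, 55, 42])

# Plain boolean mask computed on the original integer values
out_of_range = (ages < 20) | (ages > 50)

# "Int64" (capital I) is the nullable integer dtype; it can hold pd.NA
df = pd.DataFrame({"age": ages.astype("Int64")})
df.loc[out_of_range, "age"] = pd.NA
print(df)
```

The missing entries then show up as <NA> and the remaining ages stay integers.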

Posted by huangapple on 2023-02-26 23:26:27
When reposting, please keep the link to this post: https://go.coder-hub.com/75573001.html