How to replace column values that are not in a particular range with null values using a conditional in Python
Question
I have a DataFrame that contains a column for age. Some of the values are outside my desired range, and I want to replace them with null values: any age not between 20 and 50 should become null.
This is what I tried, but it doesn't seem to work:
import pandas as pd
import numpy as np
age_range = (df['age'] < 20) | (df['age'] > 50)
df[age_range = np.nan]
Answer 1
Score: 0
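This is a simple syntax error: in df[age_range = np.nan] the assignment sits inside the square brackets. Select the rows with the mask and the column by name using df.loc, then assign: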
import pandas as pd
import numpy as np
df = pd.DataFrame({'age': [18, 25, 35, 40, 55]})
age_range = (df['age'] < 20) | (df['age'] > 50)
df.loc[age_range, 'age'] = np.nan
print(df)
which gives
    age
0   NaN
1  25.0
2  35.0
3  40.0
4   NaN
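To illustrate why the column name in df.loc[age_range, 'age'] matters, here is a minimal sketch. The score column and its values are hypothetical, added only for this illustration. With a boolean mask alone, df[mask] = np.nan blanks every column in the matching rows, whereas df.loc[mask, 'age'] touches only the age column. The ages are written as floats because, as far as I know, recent pandas versions warn when NaN is assigned in place into an integer column.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the 'score' column is an assumption for illustration.
df = pd.DataFrame({'score': [1.0, 2.0, 3.0], 'age': [18.0, 25.0, 55.0]})
mask = (df['age'] < 20) | (df['age'] > 50)

whole_rows = df.copy()
whole_rows[mask] = np.nan            # mask alone: every column in matching rows

age_only = df.copy()
age_only.loc[mask, 'age'] = np.nan   # mask plus column label: only 'age'

print(whole_rows)
print(age_only)
```

In age_only the score column keeps all three of its values, which is usually what is wanted when cleaning a single column.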
Answer 2
Score: 0
You can also use np.where, which returns NaN where the condition holds and the original age otherwise:
import pandas as pd
import numpy as np
df = pd.DataFrame({'age': [18, 22, 35, 55, 42]})
df['age'] = np.where((df['age'] < 20) | (df['age'] > 50), np.nan, df['age'])
print(df)
Output:
    age
0   NaN
1  22.0
2  35.0
3   NaN
4  42.0
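As a further option, not taken from either answer, the same result can be sketched with pandas alone using Series.between and Series.where. Note that between is inclusive on both ends by default, so ages of exactly 20 and 50 are kept, matching the conditions used above.

```python
import pandas as pd

df = pd.DataFrame({'age': [18, 22, 35, 55, 42]})

# Keep ages in [20, 50]; where() replaces everything else with NaN.
in_range = df['age'].between(20, 50)
df['age'] = df['age'].where(in_range)

print(df)
```

Because where returns a new Series rather than modifying the column in place, the integer column is converted to float to hold the NaN values.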