Column in pandas dataframe with None and numbers can't be modified
Question
I am creating a pandas DataFrame from a list of dictionaries. One column can contain both None values and numeric values. I am trying to replace the None cells with a default value (10.0 below).
I do not understand why the first of the two approaches below does not work while the second one does. I hope a pandas expert can help.
import pandas as pd

# Create the pandas DataFrame, filling it row by row from the list of dicts
t = [{'Key1': None, 'OtherKey': 2, 'AnotherKey': 'whatever'}, {'Key1': 2, 'OtherKey': 3, 'AnotherKey': 'whenever'}]
df = pd.DataFrame(index=range(2), columns=['Key1', 'OtherKey', 'AnotherKey'])
idr = 0
for r in t:
    df.iloc[idr] = r
    idr += 1
I would like to change the None value in the Key1 column to the default value of 10. The lines below
df.loc[df['Key1']==None,'Key1']=10.0
print(df)
return
   Key1  OtherKey  AnotherKey
0  None         2    whatever
1     2         3    whenever
while this method
df.loc[df['Key1'].isnull(),'Key1']=10
print(df)
returns as expected:
   Key1  OtherKey  AnotherKey
0    10         2    whatever
1     2         3    whenever
Can anyone explain the difference between the two methods? Thanks.
Answer 1
Score: 1
Credit to the answer above. To add to it: in my experience, it is always worth running df.info() first. In your case, the Key1 column has dtype object, which can cause unnecessary errors later when you manipulate the data.
My suggestion: after running df.info(), convert any column that holds numeric data but has dtype object with pd.to_numeric.
The code is as follows:
df['Key1'] = pd.to_numeric(df['Key1'], errors='coerce')
This should help a lot with your data manipulation later.
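To make the difference between the two masks from the question concrete, here is a minimal sketch. It assumes the Key1 column has dtype object and holds a real None, as in the question; the explicit dtype=object and the names mask_eq and mask_null are only illustrative choices. It also applies the pd.to_numeric conversion suggested above and then fills the default with fillna, which is one possible way to finish the job rather than the only one.

```python
import pandas as pd

# Build a frame whose Key1 column has dtype object and holds a real None.
# The explicit dtype=object is an assumption made for this sketch; without
# it, pandas would convert None to NaN when the frame is constructed.
df = pd.DataFrame({
    "Key1": pd.Series([None, 2], dtype=object),
    "OtherKey": [2, 3],
    "AnotherKey": ["whatever", "whenever"],
})

print(df["Key1"].dtype)  # object

# Element-wise comparison with None: pandas treats a missing value as
# unequal to everything, so this mask selects no rows.
mask_eq = df["Key1"] == None  # noqa: E711 (deliberate, to show the pitfall)

# isnull() (an alias of isna()) is the dedicated test for missing values.
mask_null = df["Key1"].isnull()

print(mask_eq.tolist())    # [False, False]
print(mask_null.tolist())  # [True, False]

# Convert the column to a numeric dtype (None becomes NaN), then fill
# the missing entries with the default value.
df["Key1"] = pd.to_numeric(df["Key1"], errors="coerce").fillna(10.0)
print(df["Key1"].tolist())  # [10.0, 2.0]
```

Because the first mask is all False, the df.loc assignment in the question has nothing to assign to, which matches the unchanged output shown there.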