Copying values from one column to another after filtering a DataFrame: simpler and shorter solution
Question
Update
I recently found half a solution to this and wanted to post it for anyone else who is trying to figure this out.
It covers the cases where you copy into NaN values.
In the example below I have column 'Name', which is missing some names.
Then in column 'Correction' I have values that I have picked up from somewhere else and want to fill into the 'Name' column if there are no names.
import pandas as pd

df = pd.DataFrame({'Name': ['A', None, 'C'],
                   'Correction': [None, 'B', 'Q']})
out:
   Name Correction
0     A       None
1  None          B
2     C          Q
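# Copy values from column 'Correction' to 'Name' for rows where 'Name' is None.
# fillna only touches missing entries and aligns on the index, so no explicit
# filter is needed; assigning the result back avoids relying on inplace=True
# acting on what may be a temporary copy.
df['Name'] = df['Name'].fillna(df['Correction'])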
out:
  Name Correction
0    A       None
1    B          B
2    C          Q
For non-NaN cases I have found an improvement, described in the link below.
But according to a comment there, it might have issues with NaN values, which would be a problem for me.
https://datascience.stackexchange.com/questions/23264/improve-pandas-dataframe-filtering-speed
Problem
I am doing a fair bit of filtering on a pandas DataFrame and then copying from one column to another on the filtered rows, while still using the entire DataFrame afterwards.
So on a subselection of rows I copy values from one column to another. Then I use the full DataFrame (including the now-corrected rows) for something else.
In the example below I have column 'Name', which is missing some names.
Then in column 'Correction' I have values that I have picked up from somewhere else and want to fill into the 'Name' column if there are no names.
df = pd.DataFrame({'Name': ['A', None, 'C'],
                   'Correction': [None, 'B', 'Q']})
out:
   Name Correction
0     A       None
1  None          B
2     C          Q
# Copy values from column 'Correction' to 'Name' for rows where 'Name' is None
df.loc[(df['Name'].isna() == True) & (df['Correction'].isna() == False), 'Name'] = \
    df[(df['Name'].isna() == True) & (df['Correction'].isna() == False)]['Correction']
out:
  Name Correction
0    A       None
1    B          B
2    C          Q
That does work, but the readability is horrible.
Is there a more "elegant" solution where you don't have to repeat the filter twice in a row like this?
Answer 1
Score: 1
Partial answer: works for cases where you copy into NaN values
I recently found half a solution to this and wanted to post it for anyone else who is trying to figure this out.
It covers the cases where you copy into NaN values.
In the example below I have column 'Name', which is missing some names. Then in column 'Correction' I have values that I have picked up from somewhere else and want to fill into the 'Name' column if there are no names.
import pandas as pd

df = pd.DataFrame({'Name': ['A', None, 'C'],
                   'Correction': [None, 'B', 'Q']})
out:
   Name Correction
0     A       None
1  None          B
2     C          Q
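# Copy values from column 'Correction' to 'Name' for rows where 'Name' is None.
# fillna only touches missing entries and aligns on the index, so no explicit
# filter is needed; assigning the result back avoids relying on inplace=True
# acting on what may be a temporary copy.
df['Name'] = df['Name'].fillna(df['Correction'])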
out:
  Name Correction
0    A       None
1    B          B
2    C          Q
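As a rough cross-check of the NaN case, the sketch below compares three ways of getting the same result on the example data: the explicit filter (built once and stored in a variable), `Series.fillna` with another Series, and `Series.combine_first`. The variable names `rows`, `explicit`, `filled` and `combined` are made up for illustration, and the sketch assumes both columns share the same index.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['A', None, 'C'],
                   'Correction': [None, 'B', 'Q']})

# Explicit filter, computed once and reused on both sides of the assignment.
rows = df['Name'].isna() & df['Correction'].notna()
explicit = df.copy()
explicit.loc[rows, 'Name'] = explicit.loc[rows, 'Correction']

# fillna with a Series: only missing entries in 'Name' are filled,
# and values are matched by index label.
filled = df['Name'].fillna(df['Correction'])

# combine_first: keep 'Name' where present, otherwise take 'Correction'.
combined = df['Name'].combine_first(df['Correction'])

print(explicit['Name'].tolist())
print(filled.tolist())
print(combined.tolist())
```

All three should agree here; `fillna` and `combine_first` simply avoid writing the filter at all when the target values are missing.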
For non-NaN values: an improvement, but not a solution
For non-NaN cases I have found an improvement, described in the link below.
But according to a comment in that thread, it might have issues with NaN values, which would pose a problem.
https://datascience.stackexchange.com/questions/23264/improve-pandas-dataframe-filtering-speed
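Independently of the linked thread (this is not taken from it), one common pattern for the non-NaN case is to compute the boolean filter once, store it in a variable, and reuse it, or to let `Series.mask` do the replacement. The sketch below assumes a hypothetical rule, "overwrite 'Name' wherever a correction exists", purely for illustration; the names `has_correction`, `by_loc` and `by_mask` are not from the original post.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['A', None, 'C'],
                   'Correction': [None, 'B', 'Q']})

# Hypothetical rule: overwrite 'Name' wherever a correction is available,
# whether or not 'Name' was missing. The filter is written only once.
has_correction = df['Correction'].notna()

# Option 1: reuse the stored filter on both sides of the assignment.
by_loc = df.copy()
by_loc.loc[has_correction, 'Name'] = by_loc.loc[has_correction, 'Correction']

# Option 2: Series.mask replaces values where the condition is True.
by_mask = df.copy()
by_mask['Name'] = by_mask['Name'].mask(has_correction, by_mask['Correction'])

print(by_loc)
print(by_mask)
```

Because the filter is built with `notna()`, rows with a missing correction are left untouched, which may sidestep the NaN concern mentioned above; whether it fits a given filter depends on how that filter treats missing values.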