Copying value from one column to another after filtering Dataframe – Simpler and shorter solution


Question


Update

I recently found half a solution to this and wanted to post it for anyone else trying to figure this out.

It covers the case where the values you are copying into are NaN.

In the example below, the column 'Name' is missing some names.
The column 'Correction' holds values that I picked up from somewhere else and want to fill into 'Name' wherever a name is missing.


    import pandas as pd

    df = pd.DataFrame({'Name': ['A', None, 'C'],
                       'Correction': [None, 'B', 'Q']})

out:
   Name Correction
0     A  None
1  None     B
2     C     Q

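    # Copy values from column 'Correction' to 'Name' for rows where 'Name' is missing.
    # fillna aligns on the index and only touches missing entries, so no explicit
    # filter is needed. Assigning the result back avoids relying on inplace=True
    # acting on a temporary selection, which newer pandas versions warn about.
    df['Name'] = df['Name'].fillna(df['Correction'])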

out:
  Name Correction
0    A  None
1    B     B
2    C     Q

For non-NaN cases I have found an improvement, described in the link below.

However, according to a comment there, it may have issues with NaN values, which would be a problem for me.

https://datascience.stackexchange.com/questions/23264/improve-pandas-dataframe-filtering-speed

Problem

I do a fair bit of filtering on a pandas DataFrame and then copy values from one column to another on the filtered rows, while still working with the entire DataFrame.

So on a subselection of rows I copy values from one column to another. Then I use the full DataFrame (including the now-corrected rows) for other work.

In the example below, the column 'Name' is missing some names.
The column 'Correction' holds values that I picked up from somewhere else and want to fill into 'Name' wherever a name is missing.

    df = pd.DataFrame({'Name': ['A', None, 'C'],
                       'Correction': [None, 'B', 'Q']})

out:
   Name Correction
0     A  None
1  None     B
2     C     Q

    # Copy values from column 'Correction' to 'Name' for rows where 'Name' is None
    df.loc[(df['Name'].isna()==True) & (df['Correction'].isna() == False), 'Name'] = \
          df[(df['Name'].isna()==True) & (df['Correction'].isna() == False)]['Correction']

out:
  Name Correction
0    A  None
1    B     B
2    C     Q

That works, but the readability is horrible.

Is there a more "elegant" solution where you don't have to repeat the filter twice in a row like this?

Answer 1

Score: 1


Partial answer: this works when you are copying into NaN values

I recently found half a solution to this and wanted to post it for anyone else trying to figure this out.

It covers the case where the values you are copying into are NaN.

In the example below, the column 'Name' is missing some names. The column 'Correction' holds values that I picked up from somewhere else and want to fill into 'Name' wherever a name is missing.


    import pandas as pd

    df = pd.DataFrame({'Name': ['A', None, 'C'],
                       'Correction': [None, 'B', 'Q']})

out:
   Name Correction
0     A  None
1  None     B
2     C     Q

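    # Copy values from column 'Correction' to 'Name' for rows where 'Name' is missing.
    # fillna aligns on the index and only touches missing entries, so no explicit
    # filter is needed. Assigning the result back avoids relying on inplace=True
    # acting on a temporary selection, which newer pandas versions warn about.
    df['Name'] = df['Name'].fillna(df['Correction'])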

out:
  Name Correction
0    A  None
1    B     B
2    C     Q
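
The copy-into-NaN case can also be spelled with `Series.combine_first`, which keeps the caller's values and falls back to the other Series only where the caller is missing. The following is a minimal sketch, not the original poster's code: the extra fourth row (missing in both columns) and the variable name `patched` are assumptions added for illustration, to show that a row with no correction simply stays missing.

```python
import pandas as pd

# Same shape as the example above, plus one hypothetical row missing in both columns.
df = pd.DataFrame({'Name': ['A', None, 'C', None],
                   'Correction': [None, 'B', 'Q', None]})

# Keep 'Name' where present; otherwise take the index-aligned value from 'Correction'.
patched = df['Name'].combine_first(df['Correction'])

# Rows 0-2 end up with a name; row 3 has no correction and remains missing.
print(patched.isna().tolist())  # [False, False, False, True]
```

Because both Series come from the same DataFrame, they share an index, so the alignment is row by row.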

For non-NaN values: an improvement, but not a solution

For non-NaN cases I have found an improvement, described in the link below.

However, according to a comment in that thread, it may have issues with NaN values, which would be a problem.

https://datascience.stackexchange.com/questions/23264/improve-pandas-dataframe-filtering-speed
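
For the general case, where the filter is something other than "the target is missing", one common way to avoid writing the filter twice is to build the boolean mask once and reuse it. The sketch below is an illustration under assumptions, not the approach from the linked thread: the names `needs_fix` and `overwrite`, and the condition `Name == 'C'`, are hypothetical examples.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['A', None, 'C'],
                   'Correction': [None, 'B', 'Q']})

# Build the filter once; isna()/notna() already return booleans,
# so comparing them to True or False is unnecessary.
needs_fix = df['Name'].isna() & df['Correction'].notna()

# Reuse the stored mask on both sides of the assignment.
df.loc[needs_fix, 'Name'] = df.loc[needs_fix, 'Correction']
after_fill = df['Name'].tolist()
print(after_fill)  # ['A', 'B', 'C']

# Non-NaN case (hypothetical condition): overwrite existing names equal to 'C'.
# Series.mask replaces values where the condition is True with the aligned
# values from the other Series.
overwrite = df['Name'] == 'C'
df['Name'] = df['Name'].mask(overwrite, df['Correction'])
after_overwrite = df['Name'].tolist()
print(after_overwrite)  # ['A', 'B', 'Q']
```

Note that an equality comparison against a missing value evaluates to False, so a mask built this way never selects rows where the compared column is NaN. If those rows matter, combine the comparison with an explicit `isna()` term.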

huangapple
  • Posted on February 23, 2023, 22:38:35
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75546291.html