Python/Pandas. For loop on multiple dataFrames not working correctly.


Question

I am trying to process a list of DataFrames (the example shows two; in reality there are many more) in several ways using a for loop.
Dropping columns from the DataFrame referenced in the loop works fine; however, concat doesn't do anything inside the loop. I expect the original DataFrame referenced in dfs to be updated.

UPDATED PROBLEM STATEMENT

Previous examples do not cover this case, or do not seem to work.
Example adapted from here: https://stackoverflow.com/questions/50306898/pandas-dataframe-concat-using-for-loop-not-working

Minimizing the example leads to the following (code partially borrowed from another question):

import numpy as np
import pandas as pd


data = [['Alex',10],['Bob',12],['Clarke',13]]
data2 = ['m','m','x']
A = pd.DataFrame(data, columns=['Name','Age'])
B = pd.DataFrame(data, columns=['Name','Age'])
C = pd.DataFrame(data2, columns=['Gender'])

#expected result for A:
Anew=pd.DataFrame([['Alex','m'],['Bob','m'],['Clarke','x']], columns=['Name', 'Gender'])

dfs = [A,B]

for k, v in enumerate(dfs):
    # The following line works as expected on A and B respectively; inplace=True is required to actually modify A and B as defined above
    dfs[k]=v.drop('Age',axis=1, inplace=True)
    # The following line doesn't do anything, I was expecting Anew (see above) 
    dfs[k] = pd.concat([v, C], axis=1)
    # The following line prints the expected result within the loop
    print(dfs[k])

# This just shows A, not Anew. To me that means A was never updated with dfs[k] as I thought it would be.
print(A)

Answer 1

Score: 1

Update

Try:

import pandas as pd

data = [['Alex',10],['Bob',12],['Clarke',13]]
data2 = ['m','m','x']
A = pd.DataFrame(data, columns=['Name','Age'])
B = pd.DataFrame(data, columns=['Name','Age'])
C = pd.DataFrame(data2, columns=['Gender'])
Anew = pd.DataFrame([['Alex','m'],['Bob','m'],['Clarke','x']], columns=['Name', 'Gender'])

dfs = [A, B]
for v in dfs:
    v.drop('Age', axis=1, inplace=True)
    v['Gender'] = C
print(A)
print(Anew)

Output:

>>> A
     Name Gender
0    Alex      m
1     Bob      m
2  Clarke      x

>>> Anew
     Name Gender
0    Alex      m
1     Bob      m
2  Clarke      x
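
The update works because the loop variable v is the same object as A (and then B), so operations that mutate it in place are visible through the original names. A minimal sketch of that identity, reusing the question's data; assigning the column as a Series (C['Gender']) and building B with A.copy() are assumptions of this sketch, not what the answer wrote:

```python
import pandas as pd

A = pd.DataFrame([['Alex', 10], ['Bob', 12], ['Clarke', 13]], columns=['Name', 'Age'])
B = A.copy()
C = pd.DataFrame(['m', 'm', 'x'], columns=['Gender'])

dfs = [A, B]
for v in dfs:
    # v is the very same object as A (then B), not a copy
    v.drop(columns='Age', inplace=True)   # mutates the object in place
    v['Gender'] = C['Gender']             # column assignment also mutates in place

same_object = dfs[0] is A                 # the list slot still points at A
columns_of_A = list(A.columns)
print(same_object, columns_of_A)
```

Nothing is assigned back to the list here, so the list elements and the names A and B keep pointing at the same objects throughout.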

If you use inplace=True, drop modifies the DataFrame in place and returns None instead of a DataFrame, so dfs[k] is now None:

dfs[k]=v.drop('Age', axis=1, inplace=True)  # <- Remove inplace=True
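
The difference between the two calling styles can be checked directly. This is a small sketch with made-up frames (df and df2 are illustrative names, not from the question):

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alex', 'Bob'], 'Age': [10, 12]})
# With inplace=True the frame itself is changed and the call returns None
result_inplace = df.drop('Age', axis=1, inplace=True)
print(result_inplace)
print(list(df.columns))

df2 = pd.DataFrame({'Name': ['Alex', 'Bob'], 'Age': [10, 12]})
# Without inplace, df2 is left unchanged and a new frame is returned
result_copy = df2.drop('Age', axis=1)
print(list(result_copy.columns), list(df2.columns))
```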

Try:

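dfs = [A, B]
for k, v in enumerate(dfs):
    # Without inplace=True, drop returns a new DataFrame and leaves v unchanged
    dfs[k] = v.drop('Age', axis=1)
    # Concatenate the dropped result (dfs[k]), not the original v
    dfs[k] = pd.concat([dfs[k], C], axis=1)
# A itself was never modified, so it still has the Age column
out = pd.concat([A, C], axis=1)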

Output:

>>> out
     Name  Age Gender
0    Alex   10      m
1     Bob   12      m
2  Clarke   13      x

huangapple
  • Published on April 4, 2023, 17:14:38
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75927565.html