Python/Pandas. For loop on multiple dataFrames not working correctly.


Question

I am trying to process a list of dataframes (the example shows 2; in reality there are many more) in multiple ways using a for loop.
Dropping columns in the dataframe referenced in the loop works fine; however, concat doesn't do anything inside the loop. I expect it to update the original dataframe referenced in dfs.

UPDATED PROBLEM STATEMENT

Previous examples do not cover this case or do not seem to work.
Example adapted from here: https://stackoverflow.com/questions/50306898/pandas-dataframe-concat-using-for-loop-not-working

Minifying the example leads to the following (code partially borrowed from another question):

    import numpy as np
    import pandas as pd

    data = [['Alex',10],['Bob',12],['Clarke',13]]
    data2 = ['m','m','x']
    A = pd.DataFrame(data, columns=['Name','Age'])
    B = pd.DataFrame(data, columns=['Name','Age'])
    C = pd.DataFrame(data2, columns=['Gender'])
    # expected result for A:
    Anew = pd.DataFrame([['Alex','m'],['Bob','m'],['Clarke','x']], columns=['Name', 'Gender'])
    dfs = [A,B]
    for k, v in enumerate(dfs):
        # The following line works as expected on A and B respectively; inplace is required to actually modify A and B as defined above
        dfs[k] = v.drop('Age', axis=1, inplace=True)
        # The following line doesn't do anything; I was expecting Anew (see above)
        dfs[k] = pd.concat([v, C], axis=1)
        # The following line prints the expected result within the loop
        print(dfs[k])
    # This just shows A, not Anew. To me that means A was never updated with dfs[k] as I thought it would be.
    print(A)

Answer 1

Score: 1

Update

Try:

    import pandas as pd

    data = [['Alex',10],['Bob',12],['Clarke',13]]
    data2 = ['m','m','x']
    A = pd.DataFrame(data, columns=['Name','Age'])
    B = pd.DataFrame(data, columns=['Name','Age'])
    C = pd.DataFrame(data2, columns=['Gender'])
    Anew = pd.DataFrame([['Alex','m'],['Bob','m'],['Clarke','x']], columns=['Name', 'Gender'])
    dfs = [A, B]
    for v in dfs:
        # Both statements modify the object v refers to, so A and B change as well
        v.drop('Age', axis=1, inplace=True)
        v['Gender'] = C
    print(A)
    print(Anew)

Output:

    >>> A
         Name Gender
    0    Alex      m
    1     Bob      m
    2  Clarke      x
    >>> Anew
         Name Gender
    0    Alex      m
    1     Bob      m
    2  Clarke      x
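
Why the loop above updates A while the loop in the question does not can be illustrated with a small identity check. This is only a sketch using the question's data; the variable names holding the results (rebound_is_same and so on) are made up for illustration. Assigning to dfs[k] rebinds the list slot to a new object and leaves the name A pointing at the old one, whereas in-place operations change the one object that both A and the list refer to.

```python
import pandas as pd

A = pd.DataFrame([['Alex', 10], ['Bob', 12], ['Clarke', 13]], columns=['Name', 'Age'])
C = pd.DataFrame(['m', 'm', 'x'], columns=['Gender'])

# Case 1: rebinding the list slot. pd.concat builds a new DataFrame,
# and only dfs[0] points to it; the name A still refers to the original.
dfs = [A]
dfs[0] = pd.concat([dfs[0], C], axis=1)
rebound_is_same = dfs[0] is A
a_columns_after_rebind = list(A.columns)

# Case 2: mutating in place. The loop variable v is the same object as A,
# so dropping and adding columns on v is visible through A.
dfs = [A]
for v in dfs:
    v.drop(columns='Age', inplace=True)
    v['Gender'] = C['Gender']
mutated_is_same = dfs[0] is A
a_columns_after_mutation = list(A.columns)

print(rebound_is_same, a_columns_after_rebind)
print(mutated_is_same, a_columns_after_mutation)
```

In the first case A keeps its original columns; in the second, A ends up with Name and Gender.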

With inplace=True, drop modifies the DataFrame in place and returns None, so dfs[k] is set to None:

    dfs[k] = v.drop('Age', axis=1, inplace=True)  # <- Remove inplace=True
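
The difference in return values can be checked directly. The following is a minimal sketch with two throwaway frames (df_inplace and df_copy are names chosen here, not from the question):

```python
import pandas as pd

rows = [['Alex', 10], ['Bob', 12]]
df_inplace = pd.DataFrame(rows, columns=['Name', 'Age'])
df_copy = pd.DataFrame(rows, columns=['Name', 'Age'])

# inplace=True: the frame itself is modified and the call returns None
returned_inplace = df_inplace.drop('Age', axis=1, inplace=True)

# default (inplace=False): the frame is left alone and a new frame is returned
returned_copy = df_copy.drop('Age', axis=1)

print(returned_inplace)             # None
print(list(df_inplace.columns))     # ['Name']
print(list(df_copy.columns))        # ['Name', 'Age']
print(list(returned_copy.columns))  # ['Name']
```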

Try:

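    dfs = [A, B]
    for k, v in enumerate(dfs):
        # Without inplace, drop returns a new DataFrame; A and B themselves are left unchanged
        dfs[k] = v.drop('Age', axis=1)
        # Concatenate onto the dropped frame (dfs[k]), not the original v
        dfs[k] = pd.concat([dfs[k], C], axis=1)
    # A still has its 'Age' column, so it appears in out
    out = pd.concat([A, C], axis=1)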

Output:

    >>> out
         Name  Age Gender
    0    Alex   10      m
    1     Bob   12      m
    2  Clarke   13      x
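
If avoiding inplace is preferred and the names A and B should still end up holding the processed frames, one possible approach is to build new frames and assign them back explicitly. This is a sketch, not the answerer's method; the frames dict and its keys are hypothetical and only serve to keep each result paired with its name.

```python
import pandas as pd

data = [['Alex', 10], ['Bob', 12], ['Clarke', 13]]
A = pd.DataFrame(data, columns=['Name', 'Age'])
B = pd.DataFrame(data, columns=['Name', 'Age'])
C = pd.DataFrame(['m', 'm', 'x'], columns=['Gender'])

# Hypothetical container: a dict keeps each processed frame tied to a key
frames = {'A': A, 'B': B}
frames = {
    name: pd.concat([df.drop(columns='Age'), C], axis=1)
    for name, df in frames.items()
}

# Rebind the original names to the new frames explicitly
A, B = frames['A'], frames['B']
print(A)
```

With many dataframes, working with frames['A'] directly rather than separate variables avoids the rebinding step altogether.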

huangapple
  • Posted on April 4, 2023, 17:14:38
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75927565.html