April 4, 2023, 17:14:38

Python/Pandas. For loop on multiple dataFrames not working correctly

Question

I am trying to process a list of dataframes (the example shows two; in reality there are many more) in multiple ways using a for loop.
Dropping columns from the dataframe referenced in the loop works fine; however, concat doesn't do anything inside the loop. I expect it to update the original dataframe referenced in dfs.

UPDATED PROBLEM STATEMENT

Previous examples do not cover this case or do not seem to work.
Example adapted from here: https://stackoverflow.com/questions/50306898/pandas-dataframe-concat-using-for-loop-not-working

Minifying the example leads to the following (code partially borrowed from another question):

import numpy as np
import pandas as pd
data = [['Alex',10],['Bob',12],['Clarke',13]]
data2 = ['m','m','x']
A = pd.DataFrame(data, columns=['Name','Age'])
B = pd.DataFrame(data, columns=['Name','Age'])
C = pd.DataFrame(data2, columns=['Gender'])
#expected result for A:
Anew=pd.DataFrame([['Alex','m'],['Bob','m'],['Clarke','x']], columns=['Name', 'Gender'])
dfs = [A,B]
for k, v in enumerate(dfs):
    # The following line works as expected on A and B respectively; inplace is required to actually modify A, B as defined above
    dfs[k]=v.drop('Age',axis=1, inplace=True)
    # The following line doesn't do anything, I was expecting Anew (see above)
    dfs[k] = pd.concat([v, C], axis=1)
    # The following line prints the expected result within the loop
    print(dfs[k])
# This just shows A, not Anew. To me that means A was never updated with dfs[k] as I thought it would be.
print(A)

Answer 1

Score: 1

Update

Try:

data = [['Alex',10],['Bob',12],['Clarke',13]]
data2 = ['m','m','x']
A = pd.DataFrame(data, columns=['Name','Age'])
B = pd.DataFrame(data, columns=['Name','Age'])
C = pd.DataFrame(data2, columns=['Gender'])
Anew = pd.DataFrame([['Alex','m'],['Bob','m'],['Clarke','x']], columns=['Name', 'Gender'])
dfs = [A, B]
for v in dfs:
    v.drop('Age', axis=1, inplace=True)
    v['Gender'] = C
print(A)
print(Anew)

Output:

>>> A
     Name Gender
0    Alex      m
1     Bob      m
2  Clarke      x
>>> Anew
     Name Gender
0    Alex      m
1     Bob      m
2  Clarke      x
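
The loop above changes A because v and A are two names for the same DataFrame object, and both drop(..., inplace=True) and column assignment modify that object. By contrast, pd.concat returns a new DataFrame, so assigning its result to dfs[k] only rebinds the list slot. A minimal sketch of the difference, assuming the same toy data (assigning C['Gender'] as a Series instead of the whole frame C is a substitution made here for explicitness):

```python
import pandas as pd

A = pd.DataFrame([['Alex', 10], ['Bob', 12], ['Clarke', 13]], columns=['Name', 'Age'])
C = pd.DataFrame(['m', 'm', 'x'], columns=['Gender'])

dfs = [A]
v = dfs[0]
same_object = v is A                      # the list holds a reference, not a copy

# Rebinding: concat builds a new DataFrame; the list slot changes, A does not.
dfs[0] = pd.concat([v, C], axis=1)
a_cols_after_concat = list(A.columns)

# Mutation: in-place operations change the object that A also names.
v.drop(columns='Age', inplace=True)
v['Gender'] = C['Gender']
a_cols_after_mutation = list(A.columns)

print(same_object)
print(a_cols_after_concat)
print(a_cols_after_mutation)
```

Note that the assignment aligns on the index, so this relies on A and C sharing the same default RangeIndex.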

If you use inplace=True, Pandas doesn't return a DataFrame, so dfs[k] is now None:

dfs[k] = v.drop('Age', axis=1, inplace=True)  # <- Remove inplace=True
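
The return value can be checked directly. The following is a small sketch with made-up frames (df1 and df2 are illustrative names, not from the question) comparing the two call styles:

```python
import pandas as pd

df1 = pd.DataFrame({'Name': ['Alex', 'Bob'], 'Age': [10, 12]})
df2 = df1.copy()

# With inplace=True the frame is modified and the call returns None.
result_inplace = df1.drop('Age', axis=1, inplace=True)

# Without inplace the original is left alone and a new frame is returned.
result_new = df2.drop('Age', axis=1)

print(result_inplace)
print(list(df1.columns))
print(list(df2.columns))
print(list(result_new.columns))
```

So assigning the result of an inplace=True call to anything, including dfs[k], stores None.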

Try:

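dfs = [A, B]
for k, v in enumerate(dfs):
    dfs[k] = v.drop('Age', axis=1)  # returns a new DataFrame; A and B keep 'Age'
    dfs[k] = pd.concat([dfs[k], C], axis=1)  # concatenate the dropped result, not the original v
out = pd.concat([A, C], axis=1)  # A itself is unchanged, so 'Age' is still present here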

Output:

>>> out
     Name  Age Gender
0    Alex   10      m
1     Bob   12      m
2  Clarke   13      x
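
As the output shows, A still contains Age: without in-place operations, the processed frames exist only in the container that holds the results. One possible way to keep them accessible by name is to store the inputs in a dict and build a second dict of results. This is only a sketch; the helper process and the dict names frames and processed are hypothetical, not part of the original answer:

```python
import pandas as pd

data = [['Alex', 10], ['Bob', 12], ['Clarke', 13]]
A = pd.DataFrame(data, columns=['Name', 'Age'])
B = pd.DataFrame(data, columns=['Name', 'Age'])
C = pd.DataFrame(['m', 'm', 'x'], columns=['Gender'])


def process(df, extra):
    """Hypothetical helper: drop 'Age' and append the columns of `extra`."""
    return pd.concat([df.drop(columns='Age'), extra], axis=1)


# Keep the inputs in a dict so each result can be looked up by name.
frames = {'A': A, 'B': B}
processed = {name: process(df, C) for name, df in frames.items()}

print(processed['A'])
print(A)  # the input frame is not modified
```

With this layout the original A and B stay untouched, and processed['A'] plays the role of Anew from the question.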
