February 27, 2023 16:32:09

How to update a row value in an itertuples loop

Question

I have a DataFrame that I want to group by name. Once it is grouped, I want to go through each row of each group and update the value of a column so that I can then do other operations.

The problem is that when I update a row, the value is updated in the DataFrame, but the row object is not.

For example, in this case df_group.Age outputs 25, which is the updated value, but row.Age outputs 20, which is the old value. How can I make row.Age update within the same iteration so that I can continue using the updated value?

import pandas as pd

data = {'Name': ['A', 'B', 'C', 'D', 'A', 'B', 'D'],
        'Age': [20, 21, 19, 18, 21, 19, 18],
        'Size': [7, 7, 9, 8, 7, 9, 8]}
df = pd.DataFrame(data).sort_values(by='Name').reset_index(drop=True)

df['New_age'] = 0

df_grouped = df.groupby(['Name'])

for group_name, df_group in df_grouped:
    for row in df_group.itertuples():
        if row.Age == 20:
            df_group.at[row.Index, 'Age'] = 25
        print(df_group.Age)
        print(row.Age)

        # Do things with the row.Age value = 25

Answer 1

Score: 2

The row.Age value is not updated inside the itertuples loop because the row object is a named tuple, which is immutable.
To get what you want, update the value in the DataFrame with the df.loc accessor and then rebind row to an updated copy of the tuple with _replace:

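for group_name, df_group in df_grouped:
    for row in df_group.itertuples():
        if row.Age == 20:
            df.loc[row.Index, 'Age'] = 25  # writes to the original df, not to df_group
            row = row._replace(Age=25)  # rebind row to an updated copy of the named tuple
        # df_group is typically a separate object from df, so this may still show 20
        print(df_group.Age)
        print(row.Age)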
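
As a minimal, self-contained sketch of the point above (the small DataFrame and the names `seen` and `assignment_rejected` are illustrative assumptions, not part of the original answer): assigning to an attribute of the tuple is rejected, while `_replace` returns a new tuple that can be bound to the same name.

```python
import pandas as pd

# Small illustrative frame; the values are made up for this sketch
df = pd.DataFrame({'Name': ['A', 'A', 'B'], 'Age': [20, 21, 19]})

seen = []                    # ages as seen through `row` after the update
assignment_rejected = False  # set to True if the tuple refuses assignment

for row in df.itertuples():
    if row.Age == 20:
        try:
            row.Age = 25  # named tuples are immutable, so this raises
        except AttributeError:
            assignment_rejected = True
        df.at[row.Index, 'Age'] = 25  # update the DataFrame itself
        row = row._replace(Age=25)    # rebind `row` to an updated copy
    seen.append(row.Age)

print(seen)
print(df['Age'].tolist())
```

If the value should come from the DataFrame rather than be repeated in `_replace`, reading it back with `df.at[row.Index, 'Age']` after the update is another option.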

Answer 2

Score: 1

Do you need to update the original DataFrame? Then use df instead of df_group:

df.at[row.Index, 'Age'] = 25

I suggest avoiding loops in pandas; it is best to vectorize where possible, as is the case here.
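
For the specific update in the question, a vectorized version might look like the sketch below. It assumes the only goal is to turn every Age of 20 into 25, and it reuses the question's data; any further per-row processing would need its own vectorized expression.

```python
import pandas as pd

data = {'Name': ['A', 'B', 'C', 'D', 'A', 'B', 'D'],
        'Age': [20, 21, 19, 18, 21, 19, 18],
        'Size': [7, 7, 9, 8, 7, 9, 8]}
df = pd.DataFrame(data).sort_values(by='Name').reset_index(drop=True)

# The boolean mask selects every row whose Age is 20;
# .loc then assigns 25 to the Age column of those rows in one step
df.loc[df['Age'] == 20, 'Age'] = 25

print(df)
```

No loop or row object is involved, so the question of a stale `row.Age` does not arise.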

If you need to loop over groups and process the values of each group, use a custom function:

def f(x):
    print(x)
    # processing
    # x.loc[x.Age == 20, 'Age'] = 25
    # x['new'] = 'output of processing'
    return x

df1 = df.groupby(['Name']).apply(f)
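
A filled-in version of that pattern could look like the following sketch. The function name `process` and the column `Age_minus_group_mean` are hypothetical and stand in for whatever per-group processing is actually needed. Depending on the pandas version, `apply` may warn about, or leave out, the grouping column inside the function, so the sketch does not touch `Name` there.

```python
import pandas as pd

data = {'Name': ['A', 'B', 'C', 'D', 'A', 'B', 'D'],
        'Age': [20, 21, 19, 18, 21, 19, 18],
        'Size': [7, 7, 9, 8, 7, 9, 8]}
df = pd.DataFrame(data).sort_values(by='Name').reset_index(drop=True)

def process(group):
    # Work on a copy so the original frame is left untouched
    group = group.copy()
    # Update the value first ...
    group.loc[group['Age'] == 20, 'Age'] = 25
    # ... then use the updated value in further per-group processing
    # (hypothetical example: deviation from the group's mean age)
    group['Age_minus_group_mean'] = group['Age'] - group['Age'].mean()
    return group

# group_keys=False keeps the original row index instead of adding the group key
result = df.groupby('Name', group_keys=False).apply(process)

print(result)
```

Because the update happens inside the function before the later steps, every subsequent computation for that group sees 25 rather than 20.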
