Sorting pd.DataFrame

Question

I have a DataFrame as follows:

import pandas as pd

data = [
    [1, 188],
    [1, 258],
    [1, 386],
    [1, 385],
    [1, 386],
    [2, 111],
    [2, 253],
    [2, 812],
    [3, 936],
    [3, 121],
    [3, 273],
    [3, 554],
]

df = pd.DataFrame(data, columns=['Num', 'Val'])
print(df)

What would be the best way to create a new DataFrame whose columns hold the following statistics for each 'Num' group?

 Val First - the first value for a given Num, in row order;
 Val Last - the last value for a given Num, in row order;
 Val Min - the minimum value for a given Num;
 Val Max - the maximum value for a given Num.

Expected output:

df_new = pd.DataFrame({
    'Num': [1, 2, 3],
    'Val First': [188, 111, 936],
    'Val Last': [386, 812, 554],
    'Val Min': [188, 111, 121],
    'Val Max': [386, 812, 936]
})

df_new.columns = ["Num", "Val First", "Val Last", "Val Min", "Val Max"]

print(df_new)

I'd be grateful for any help, and maybe it will also help other people learn pandas faster.

I tried:

df_new = df.groupby('Num').agg({'Val': ['min', 'max']})

This gives Val Min and Val Max for each Num group, but I can't figure out how to get the elements at the edges of each group (Val First and Val Last).

Answer 1

Score: 1

Use df.groupby().agg() with a list of aggregations, then assign the new column names:

df_new = df.groupby('Num').agg({'Val': ['first', 'last', 'min', 'max']})
df_new.columns = ['Val First', 'Val Last', 'Val Min', 'Val Max']
df_new.reset_index(inplace=True)

print(df_new)

   Num  Val First  Val Last  Val Min  Val Max
0    1        188       386      188      386
1    2        111       812      111      812
2    3        936       554      121      936
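As a side note, the hard-coded list of names above can also be derived from the MultiIndex that agg() returns when it is given a dict of lists. The following is only a sketch of that idea, using the question's data; the "column name plus capitalized function name" scheme is an assumption chosen to match the expected output, not something the original answer prescribes.

```python
import pandas as pd

data = [
    [1, 188], [1, 258], [1, 386], [1, 385], [1, 386],
    [2, 111], [2, 253], [2, 812],
    [3, 936], [3, 121], [3, 273], [3, 554],
]
df = pd.DataFrame(data, columns=["Num", "Val"])

stats = df.groupby("Num").agg({"Val": ["first", "last", "min", "max"]})

# stats.columns is a MultiIndex of (column, function) pairs,
# e.g. ('Val', 'first'); join each pair into a single flat label.
stats.columns = [f"{col} {func.capitalize()}" for col, func in stats.columns]

df_new = stats.reset_index()
print(df_new)
```

This keeps the labels in sync with the aggregation list if more functions are added later.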

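You can also name the output columns directly in agg() (named aggregation). Because the names contain spaces, pass them by unpacking a dict with **:

df_new = df.groupby('Num').agg(
    **{
        'Val First': ('Val', 'first'),
        'Val Last': ('Val', 'last'),
        'Val Min': ('Val', 'min'),
        'Val Max': ('Val', 'max')  # each key must be unique, or a later entry overwrites an earlier one
    }).reset_index()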

print(df_new)
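One way to sanity-check the named-aggregation variant is to compare its result with the expected frame from the question. The sketch below assumes the question's data and uses as_index=False in place of reset_index(), which is a stylistic choice here rather than part of the original answer.

```python
import pandas as pd

data = [
    [1, 188], [1, 258], [1, 386], [1, 385], [1, 386],
    [2, 111], [2, 253], [2, 812],
    [3, 936], [3, 121], [3, 273], [3, 554],
]
df = pd.DataFrame(data, columns=["Num", "Val"])

# as_index=False keeps 'Num' as a regular column in the result.
df_new = df.groupby("Num", as_index=False).agg(
    **{
        "Val First": ("Val", "first"),
        "Val Last": ("Val", "last"),
        "Val Min": ("Val", "min"),
        "Val Max": ("Val", "max"),
    }
)

expected = pd.DataFrame({
    "Num": [1, 2, 3],
    "Val First": [188, 111, 936],
    "Val Last": [386, 812, 554],
    "Val Min": [188, 111, 121],
    "Val Max": [386, 812, 936],
})

# Raises AssertionError if values, column names, or dtypes differ.
pd.testing.assert_frame_equal(df_new, expected)
```

Note that 'first' and 'last' follow the row order of df within each group, so if the rows need a particular ordering, sort the frame before grouping.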

huangapple
  • Posted on March 4, 2023, 02:54:05
  • When reposting, please keep the link to this post: https://go.coder-hub.com/75630868.html