Sorting pd.DataFrame
Question
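# Use the following code to build a new DataFrame whose columns hold the statistics for each 'Num' group:
df_new = df.groupby('Num').agg({
    'Val': ['first', 'last', 'min', 'max']
}).reset_index()
# Rename the columns (this also flattens the two-level labels that agg produces)
df_new.columns = ['Num', 'Val First', 'Val Last', 'Val Min', 'Val Max']
# Print the result
print(df_new)
This produces the output you expect:
   Num  Val First  Val Last  Val Min  Val Max
0    1        188       386      188      386
1    2        111       812      111      812
2    3        936       554      121      936
For each 'Num' group, this code computes the first value (Val First), the last value (Val Last), the minimum (Val Min), and the maximum (Val Max), and puts them into a new DataFrame.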
Original question:
I have a DataFrame as follows:
import pandas as pd
data = [
    [1, 188],
    [1, 258],
    [1, 386],
    [1, 385],
    [1, 386],
    [2, 111],
    [2, 253],
    [2, 812],
    [3, 936],
    [3, 121],
    [3, 273],
    [3, 554],
]
df = pd.DataFrame(data, columns=['Num', 'Val'])
print(df)
What would be the best way to create a new DF in which the columns represent the following statistics for each 'Num' group:
Val First - the first value of a certain Num in the list;
Val Last - the last value of a certain Num in the list;
Val Min - the minimum value of a certain Num in the list;
Val Max - the maximum value of a certain Num in the list.
Expected output:
df_new = pd.DataFrame({
    'Num': [1, 2, 3],
    'Val First': [188, 111, 936],
    'Val Last': [386, 812, 554],
    'Val Min': [188, 111, 121],
    'Val Max': [386, 812, 936]
})
df_new.columns = ["Num", "Val First", "Val Last", "Val Min", "Val Max"]
print(df_new)
I will be grateful for your help and maybe it will help other people learn to work with pandas faster...
I tried this:
df_new = df.groupby('Num').agg({'Val': ['min', 'max']})
It finds Val Min and Val Max for each Num group, but I can't figure out how to get the elements at the edges of each group (Val First and Val Last).
Answer 1
Score: 1
Use df.groupby().agg() and assign new column names:
df_new = df.groupby('Num').agg({'Val': ['first', 'last', 'min', 'max']})
df_new.columns = ['Val First', 'Val Last', 'Val Min', 'Val Max']
df_new.reset_index(inplace=True)
print(df_new)
   Num  Val First  Val Last  Val Min  Val Max
0    1        188       386      188      386
1    2        111       812      111      812
2    3        936       554      121      936
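Some context on why the rename step is there: agg() with a dict of lists returns two-level column labels, one (column, function) pair per statistic. Here is a minimal sketch on the question's data that prints those labels and then builds the flat names from them instead of typing them out. The title() formatting and the variable names agg_df, multi_labels and flat_labels are just my choices for illustration.

```python
import pandas as pd

# Data copied from the question.
data = [[1, 188], [1, 258], [1, 386], [1, 385], [1, 386],
        [2, 111], [2, 253], [2, 812],
        [3, 936], [3, 121], [3, 273], [3, 554]]
df = pd.DataFrame(data, columns=['Num', 'Val'])

agg_df = df.groupby('Num').agg({'Val': ['first', 'last', 'min', 'max']})

# Before renaming, the columns are a two-level MultiIndex of (column, function) pairs.
multi_labels = list(agg_df.columns)

# Build "Val First", "Val Last", ... from those pairs.
agg_df.columns = [f"{col} {func.title()}" for col, func in agg_df.columns]
flat_labels = list(agg_df.columns)

# Turn the 'Num' group keys back into a regular column.
df_new = agg_df.reset_index()
print(multi_labels)
print(df_new)
```

This gives the same frame as the hand-written rename, and the names stay in sync if you later change the list of functions.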
You can also use ** unpacking to assign the column names directly in agg():
df_new = df.groupby('Num').agg(
    **{
        'Val First': ('Val', 'first'),
        'Val Last': ('Val', 'last'),
        'Val Min': ('Val', 'min'),
        'Val Max': ('Val', 'max')
    }).reset_index()
print(df_new)
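If you want to run the keyword-style version end to end, here is a self-contained sketch on the question's data. It assumes each output name maps to a (column, function) tuple, and the dict name named_stats is just something I picked. Keep in mind that 'first' and 'last' follow the row order inside each group, so sort the frame first if that order matters to you.

```python
import pandas as pd

# Data copied from the question.
data = [[1, 188], [1, 258], [1, 386], [1, 385], [1, 386],
        [2, 111], [2, 253], [2, 812],
        [3, 936], [3, 121], [3, 273], [3, 554]]
df = pd.DataFrame(data, columns=['Num', 'Val'])

# Output column name -> (source column, aggregation function).
# Each key must be unique: a repeated key would silently overwrite the earlier entry.
named_stats = {
    'Val First': ('Val', 'first'),
    'Val Last': ('Val', 'last'),
    'Val Min': ('Val', 'min'),
    'Val Max': ('Val', 'max'),
}

# The ** unpacking is needed because names with spaces can't be written as plain keyword arguments.
df_new = df.groupby('Num').agg(**named_stats).reset_index()
print(df_new)
```

With this form the columns come out flat and already named, so no separate rename step is needed.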