Sorting pd.DataFrame

Question


I have a DataFrame as follows:

    import pandas as pd

    data = [
        [1, 188],
        [1, 258],
        [1, 386],
        [1, 385],
        [1, 386],
        [2, 111],
        [2, 253],
        [2, 812],
        [3, 936],
        [3, 121],
        [3, 273],
        [3, 554],
    ]
    df = pd.DataFrame(data, columns=['Num', 'Val'])
    print(df)

What would be the best way to create a new DF in which the columns represent the following statistics for each 'Num' group:

  1. Val First - the first value of a certain Num in the list;
  2. Val Last - the last value of a certain Num in the list;
  3. Val Min - the minimum value of a certain Num in the list;
  4. Val Max - the maximum value of a certain Num in the list.

Expected output:

    df_new = pd.DataFrame({
        'Num': [1, 2, 3],
        'Val First': [188, 111, 936],
        'Val Last': [386, 812, 554],
        'Val Min': [188, 111, 121],
        'Val Max': [386, 812, 936]
    })
    df_new.columns = ["Num", "Val First", "Val Last", "Val Min", "Val Max"]
    print(df_new)

I will be grateful for your help and maybe it will help other people learn to work with pandas faster...

I tried to manage this by using:

    df_new = df.groupby('Num').agg({'Val': ['min', 'max']})

That finds Val Min and Val Max for each Num group, but I can't figure out how to get the elements at the edges of each group (Val First and Val Last).

Answer 1

Score: 1


Use df.groupby().agg() and assign new column names:

    df_new = df.groupby('Num').agg({'Val': ['first', 'last', 'min', 'max']})
    df_new.columns = ['Val First', 'Val Last', 'Val Min', 'Val Max']
    df_new.reset_index(inplace=True)
    print(df_new)

       Num  Val First  Val Last  Val Min  Val Max
    0    1        188       386      188      386
    1    2        111       812      111      812
    2    3        936       554      121      936
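If you would rather not hard-code the new names, a minimal sketch like the one below derives them from the two-level column index that agg() returns. The naming rule (column name, a space, then the capitalised function name) and the variable names flat_names and df_flat are my own choices for illustration, not part of the original answer.

```python
import pandas as pd

data = [
    [1, 188], [1, 258], [1, 386], [1, 385], [1, 386],
    [2, 111], [2, 253], [2, 812],
    [3, 936], [3, 121], [3, 273], [3, 554],
]
df = pd.DataFrame(data, columns=["Num", "Val"])

stats = df.groupby("Num").agg({"Val": ["first", "last", "min", "max"]})

# stats.columns is a MultiIndex of (column, function) pairs,
# e.g. ("Val", "first"); join each pair into a single flat label.
flat_names = [f"{col} {func.title()}" for col, func in stats.columns]
stats.columns = flat_names

# Turn the 'Num' group keys back into an ordinary column.
df_flat = stats.reset_index()
print(df_flat)
```

This should give the same table as above, and it keeps working if you add more functions to the list.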

You can also name the output columns inside agg() with named aggregation. Because these names contain spaces, pass them by unpacking a dict with **:

    df_new = df.groupby('Num').agg(
        **{
            'Val First': ('Val', 'first'),
            'Val Last': ('Val', 'last'),
            'Val Min': ('Val', 'min'),
            'Val Max': ('Val', 'max')
        }).reset_index()
    print(df_new)
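For comparison, here is a rough sketch of the same named aggregation written with plain keyword arguments, which only works when the output names are valid Python identifiers. The snake_case names (val_first and so on) are hypothetical and chosen just for this example. The last lines check that both forms produce the same numbers.

```python
import pandas as pd

data = [
    [1, 188], [1, 258], [1, 386], [1, 385], [1, 386],
    [2, 111], [2, 253], [2, 812],
    [3, 936], [3, 121], [3, 273], [3, 554],
]
df = pd.DataFrame(data, columns=["Num", "Val"])

# Keyword form: each output name must be a valid identifier (no spaces).
by_keyword = df.groupby("Num").agg(
    val_first=("Val", "first"),
    val_last=("Val", "last"),
    val_min=("Val", "min"),
    val_max=("Val", "max"),
).reset_index()

# Dict-unpacking form: lets the output names contain spaces.
by_unpacking = df.groupby("Num").agg(
    **{
        "Val First": ("Val", "first"),
        "Val Last": ("Val", "last"),
        "Val Min": ("Val", "min"),
        "Val Max": ("Val", "max"),
    }
).reset_index()

# After renaming, the two results should be identical.
renamed = by_keyword.rename(columns={
    "val_first": "Val First",
    "val_last": "Val Last",
    "val_min": "Val Min",
    "val_max": "Val Max",
})
pd.testing.assert_frame_equal(renamed, by_unpacking)
print(by_unpacking)
```

Note that every key in the unpacked dict has to be unique. If a key is repeated, Python silently keeps only the last entry and the other column disappears from the result.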

huangapple
  • Posted on March 4, 2023, 02:54:05
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75630868.html