按层次分组多列排序

huangapple go评论89阅读模式
英文:

Groupby Multi-Column Sort based on hierarchy

问题

我正在尝试按列的层次结构/顺序对groupby的结果进行排序。

数据框示例:

  1. df = pd.DataFrame({'A':['A1','A1','A1','A2','A2','A2','A1','A1'],
  2. 'B':['B2','B1','B1','B1','B1','B2','B2','B2'],
  3. 'C':['C1','C3','C1','C3','C2','C2','C1','C3'],
  4. 'V':[100,240,110,120,50,60,40,40]})

接着进行groupby操作:

  1. df.groupby(['A','B','C']).agg({'V':'sum'}).sort_values(by=['A','V','B','V'], ascending=[False,False,False,False])

输出结果:

  1. V
  2. A B C
  3. A2 B1 C3 120
  4. B2 C2 60
  5. B1 C2 50
  6. A1 B1 C3 240
  7. B2 C1 140
  8. B1 C1 110
  9. B2 C3 40

然而,我想要的是首先根据A1的总值(与B和C的值无关)对A1进行排序,然后根据B的总值对B进行排序。

因此,输出应该类似于这样:

  1. V
  2. A B C
  3. A1 B1 C3 240
  4. B1 C1 110
  5. B2 C1 140
  6. B2 C3 40
  7. A2 B1 C3 120
  8. B1 C2 50
  9. B2 C2 60

请让我知道是否我解释得清楚。

英文:

I am trying to sort outcome of a groupby by the hierarchy / order of the columns.

Example of dataframe:

  1. df = pd.DataFrame({'A':['A1','A1','A1','A2','A2','A2','A1','A1'],
  2. 'B':['B2','B1','B1','B1','B1','B2','B2','B2'],
  3. 'C':['C1','C3','C1','C3','C2','C2','C1','C3'],
  4. 'V':[100,240,110,120,50,60,40,40]})

Followed by the groupby operation:

  1. df.groupby(['A','B','C']).agg({'V':'sum'}).sort_values(by=['A','V','B','V'],ascending=[False,False,False,False])

Output:

  1. V
  2. A B C
  3. A2 B1 C3 120
  4. B2 C2 60
  5. B1 C2 50
  6. A1 B1 C3 240
  7. B2 C1 140
  8. B1 C1 110
  9. B2 C3 40

What I want however is A1 should be sorted first based on it's own total value (irrespective of B & C values), followed by B based on it's own total.

So the output should look something like this:

  1. V
  2. A B C
  3. A1 B1 C3 240
  4. B1 C1 110
  5. B2 C1 140
  6. B2 C3 40
  7. A2 B1 C3 120
  8. B1 C2 50
  9. B2 C2 60

Let me know if I could explain it well.

答案1

得分: 1

  1. df = pd.DataFrame({'A':['A1','A1','A1','A2','A2','A2','A1','A1'],
  2. 'B':['B2','B1','B1','B1','B1','B2','B2','B2'],
  3. 'C':['C1','C3','C1','C3','C2','C2','C1','C3'],
  4. 'V':[100,240,110,120,50,60,40,40]})
  5. aa = df.groupby(['A','B','C']).sum()
  6. print(aa)
  7. """
  8. V
  9. A B C
  10. A1 B1 C1 110
  11. C3 240
  12. B2 C1 140
  13. C3 40
  14. A2 B1 C2 50
  15. C3 120
  16. B2 C2 60
  17. """
  18. t1= (
  19. df.groupby(['A','B','C']).sum().sort_values(['V','C'],ascending=False)
  20. .sort_index(level=[0,1],sort_remaining = False)
  21. )
  22. print(t1)
  23. """
  24. V
  25. A B C
  26. A1 B1 C3 240
  27. C1 110
  28. B2 C1 140
  29. C3 40
  30. A2 B1 C3 120
  31. C2 50
  32. B2 C2 60
  33. """
英文:
  1. df = pd.DataFrame({'A':['A1','A1','A1','A2','A2','A2','A1','A1'],
  2. 'B':['B2','B1','B1','B1','B1','B2','B2','B2'],
  3. 'C':['C1','C3','C1','C3','C2','C2','C1','C3'],
  4. 'V':[100,240,110,120,50,60,40,40]})
  5. aa = df.groupby(['A','B','C']).sum()
  6. print(aa)
  7. """
  8. V
  9. A B C
  10. A1 B1 C1 110
  11. C3 240
  12. B2 C1 140
  13. C3 40
  14. A2 B1 C2 50
  15. C3 120
  16. B2 C2 60
  17. """
  18. t1= (
  19. df.groupby(['A','B','C']).sum().sort_values(['V','C'],ascending=False)
  20. .sort_index(level=[0,1],sort_remaining = False)
  21. )
  22. print(t1)
  23. """
  24. V
  25. A B C
  26. A1 B1 C3 240
  27. C1 110
  28. B2 C1 140
  29. C3 40
  30. A2 B1 C3 120
  31. C2 50
  32. B2 C2 60
  33. """

答案2

得分: 0

IIUC,您希望同时按ABV列进行排序

  1. out = (df.groupby(['A','B','C']).agg({'V':'sum'})
  2. .sort_values(by=['A','B','V'], ascending=[True,True,False]))
  1. print(out)
  2. V
  3. A B C
  4. A1 B1 C3 240
  5. C1 110
  6. B2 C1 140
  7. C3 40
  8. A2 B1 C3 120
  9. C2 50
  10. B2 C2 60
英文:

IIUC, you want to sort by A, B and V column in the same time

  1. out = (df.groupby(['A','B','C']).agg({'V':'sum'})
  2. .sort_values(by=['A','B','V'], ascending=[True,True,False]))
  1. print(out)
  2. V
  3. A B C
  4. A1 B1 C3 240
  5. C1 110
  6. B2 C1 140
  7. C3 40
  8. A2 B1 C3 120
  9. C2 50
  10. B2 C2 60

huangapple
  • 本文由 发表于 2023年3月1日 12:20:37
  • 转载请务必保留本文链接:https://go.coder-hub.com/75599546.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定