按层次分组多列排序

huangapple go评论61阅读模式
英文:

Groupby Multi-Column Sort based on hierarchy

问题

我正在尝试按列的层次结构/顺序对groupby的结果进行排序。

数据框示例:

df = pd.DataFrame({'A':['A1','A1','A1','A2','A2','A2','A1','A1'],
                   'B':['B2','B1','B1','B1','B1','B2','B2','B2'],
                   'C':['C1','C3','C1','C3','C2','C2','C1','C3'],
                   'V':[100,240,110,120,50,60,40,40]})

接着进行groupby操作:

df.groupby(['A','B','C']).agg({'V':'sum'}).sort_values(by=['A','V','B','V'], ascending=[False,False,False,False])

输出结果:

			V
A	B	C	
A2	B1	C3	120
    B2	C2	60
    B1	C2	50
A1	B1	C3	240
    B2	C1	140
    B1	C1	110
    B2	C3	40

然而,我想要的是首先根据A1的总值(与B和C的值无关)对A1进行排序,然后根据B的总值对B进行排序。

因此,输出应该类似于这样:

			V
A	B	C	
A1	B1	C3	240
    B1	C1	110
    B2	C1	140
    B2	C3	40
A2	B1	C3	120
    B1	C2	50
    B2	C2	60

请让我知道是否我解释得清楚。

英文:

I am trying to sort outcome of a groupby by the hierarchy / order of the columns.

Example of dataframe:

df = pd.DataFrame({'A':['A1','A1','A1','A2','A2','A2','A1','A1'],
                   'B':['B2','B1','B1','B1','B1','B2','B2','B2'],
                   'C':['C1','C3','C1','C3','C2','C2','C1','C3'],
                   'V':[100,240,110,120,50,60,40,40]})

Followed by the groupby operation:

df.groupby(['A','B','C']).agg({'V':'sum'}).sort_values(by=['A','V','B','V'],ascending=[False,False,False,False])

Output:

			V
A	B	C	
A2	B1	C3	120
    B2	C2	60
    B1	C2	50
A1	B1	C3	240
    B2	C1	140
    B1	C1	110
    B2	C3	40

What I want however is A1 should be sorted first based on it's own total value (irrespective of B & C values), followed by B based on it's own total.

So the output should look something like this:

			V
A	B	C	
A1	B1	C3	240
    B1	C1	110
    B2	C1	140
    B2	C3	40
A2	B1	C3	120
    B1	C2	50
    B2	C2	60

Let me know if I could explain it well.

答案1

得分: 1

df = pd.DataFrame({'A':['A1','A1','A1','A2','A2','A2','A1','A1'],
                   'B':['B2','B1','B1','B1','B1','B2','B2','B2'],
                   'C':['C1','C3','C1','C3','C2','C2','C1','C3'],
                   'V':[100,240,110,120,50,60,40,40]})

aa = df.groupby(['A','B','C']).sum()
print(aa)

"""
            V
A  B  C      
A1 B1 C1  110
      C3  240
   B2 C1  140
      C3   40
A2 B1 C2   50
      C3  120
   B2 C2   60
"""

t1= (
    df.groupby(['A','B','C']).sum().sort_values(['V','C'],ascending=False)
    .sort_index(level=[0,1],sort_remaining = False)
    )

print(t1)

"""
            V
A  B  C      
A1 B1 C3  240
      C1  110
   B2 C1  140
      C3   40
A2 B1 C3  120
      C2   50
   B2 C2   60
"""
英文:
df = pd.DataFrame({'A':['A1','A1','A1','A2','A2','A2','A1','A1'],
                   'B':['B2','B1','B1','B1','B1','B2','B2','B2'],
                   'C':['C1','C3','C1','C3','C2','C2','C1','C3'],
                   'V':[100,240,110,120,50,60,40,40]})




aa = df.groupby(['A','B','C']).sum()
print(aa)


"""
            V
A  B  C      
A1 B1 C1  110
      C3  240
   B2 C1  140
      C3   40
A2 B1 C2   50
      C3  120
   B2 C2   60

"""


t1= (
    df.groupby(['A','B','C']).sum().sort_values(['V','C'],ascending=False)
    .sort_index(level=[0,1],sort_remaining = False)
    )

print(t1)

"""
            V
A  B  C      
A1 B1 C3  240
      C1  110
   B2 C1  140
      C3   40
A2 B1 C3  120
      C2   50
   B2 C2   60

"""

答案2

得分: 0

IIUC,您希望同时按ABV列进行排序

out = (df.groupby(['A','B','C']).agg({'V':'sum'})
       .sort_values(by=['A','B','V'], ascending=[True,True,False]))
print(out)

            V
A  B  C
A1 B1 C3  240
      C1  110
   B2 C1  140
      C3   40
A2 B1 C3  120
      C2   50
   B2 C2   60
英文:

IIUC, you want to sort by A, B and V column in the same time

out = (df.groupby(['A','B','C']).agg({'V':'sum'})
       .sort_values(by=['A','B','V'], ascending=[True,True,False]))
print(out)

            V
A  B  C
A1 B1 C3  240
      C1  110
   B2 C1  140
      C3   40
A2 B1 C3  120
      C2   50
   B2 C2   60

huangapple
  • 本文由 发表于 2023年3月1日 12:20:37
  • 转载请务必保留本文链接:https://go.coder-hub.com/75599546.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定