How to add values of multiple rows during/after "groupby"?

Question

I have an assignment dataset that consists of students' grades for various modules, sorted by academic group. In my assignment, I used the following code to get an output, which I later hardcoded into a table:

math_comparison_size = data.groupby(["Academic Group","Math.SemGrade"]).size()

Is there a way to combine/merge certain rows so that their values are added together?

For example:

A (the A and A+ values combined for each academic group; e.g., for Group A it would be 28 instead of 7 and 21)

and so on...

Answer 1

Score: 1

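Update

Here df is the Series returned by groupby(...).size(), indexed by academic group and grade (see the input data below), and mapping is the dict defined in the old answer further down:

>>> df.unstack(level=0).groupby(df.index.levels[1].map(mapping)).sum()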

Academic Group  Grp A  Grp B  Grp C  Grp D  Grp E
Math.SemGrade                                    
A                  43     49     93     82     39
B                  24     69     65     59     57
C                  20      8      5     23     13

Input data:

Academic Group  Math.SemGrade
Grp A           A                13
                A+               17
                AD                6
                B                 1
                B+                6
                C                 9
                C+                0
                D                 9
                D+                6
                F                20
Grp B           A                 6
                A+                8
                AD               19
                B                 6
                B+               10
                C                24
                C+                5
                D                11
                D+               29
                F                 8
Grp C           A                22
                A+               27
                AD               16
                B                14
                B+               14
                C                26
                C+               14
                D                 0
                D+               25
                F                 5
Grp D           A                29
                A+               23
                AD                2
                B                11
                B+               17
                C                 1
                C+               27
                D                 3
                D+               28
                F                23
Grp E           A                 2
                A+                9
                AD                4
                B                 9
                B+               15
                C                18
                C+               10
                D                 5
                D+               24
                F                13
dtype: int64
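
As a self-contained sketch of the same idea, the grade level of the Series can also be mapped explicitly before summing, which avoids relying on the order of `df.index.levels[1]`. The snippet below rebuilds only the Grp A and Grp B counts from the input data above; the variable names (`sizes`, `merged`, `table`) are assumptions made for illustration, and `mapping` is copied from the old answer below.

```python
import pandas as pd

# Counts for Grp A and Grp B, copied from the input data shown above.
grades = ["A", "A+", "AD", "B", "B+", "C", "C+", "D", "D+", "F"]
counts = {
    "Grp A": [13, 17, 6, 1, 6, 9, 0, 9, 6, 20],
    "Grp B": [6, 8, 19, 6, 10, 24, 5, 11, 29, 8],
}

# Rebuild a Series shaped like the output of groupby(...).size().
index = pd.MultiIndex.from_product(
    [list(counts), grades], names=["Academic Group", "Math.SemGrade"]
)
sizes = pd.Series([n for group in counts.values() for n in group], index=index)

# Same mapping dict as in the old answer.
mapping = {'A': 'A', 'A+': 'A', 'AD': 'A', 'B': 'A', 'B+': 'A',
           'C': 'B', 'C+': 'B', 'D': 'B', 'D+': 'B',
           'E': 'C', 'F': 'C'}

# Group by (academic group, mapped grade) and add up the counts.
merged = sizes.groupby([
    sizes.index.get_level_values("Academic Group"),
    sizes.index.get_level_values("Math.SemGrade").map(mapping),
]).sum()

# Move the academic groups into columns, as in the table above.
table = merged.unstack(level=0)
print(table)
```

For these two groups the sums should match the Grp A and Grp B columns of the table above.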

Old answer

If you want to group by the first letter (A -> A, A+ -> A, ...), you can use:

>>> (df.groupby(df['Maths Semester Grades'].str[0])
       .sum(numeric_only=True).reset_index())

  Maths Semester Grades  Grp A  Grp B  Grp C  Grp D  Grp E
0                     A     36     33     65     54     15
1                     B      7     16     28     28     24
2                     C      9     29     40     28     28
3                     D     15     40     25     31     29
4                     F     20      8      5     23     13

If you want to control the groups, use a mapping dict:

mapping = {'A': 'A', 'A+': 'A', 'AD': 'A', 'B': 'A', 'B+': 'A',
           'C': 'B', 'C+': 'B', 'D': 'B', 'D+': 'B',
           'E': 'C', 'F': 'C'}

>>> (df.groupby(df['Maths Semester Grades'].map(mapping))
       .sum(numeric_only=True).reset_index())

  Maths Semester Grades  Grp A  Grp B  Grp C  Grp D  Grp E
0                     A     43     49     93     82     39
1                     B     24     69     65     59     57
2                     C     20      8      5     23     13

Input dataframe:

>>> df
  Maths Semester Grades  Grp A  Grp B  Grp C  Grp D  Grp E
0                     A     13      6     22     29      2
1                    A+     17      8     27     23      9
2                    AD      6     19     16      2      4
3                     B      1      6     14     11      9
4                    B+      6     10     14     17     15
5                     C      9     24     26      1     18
6                    C+      0      5     14     27     10
7                     D      9     11      0      3      5
8                    D+      6     29     25     28     24
9                     F     20      8      5     23     13
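
If the raw per-student data is still available, the grades can also be merged during the groupby itself rather than afterwards. The following is a minimal sketch under that assumption: the six rows of `data` are invented purely for illustration, the column names are taken from the question, and `merged_grade` is a hypothetical helper name.

```python
import pandas as pd

# Toy per-student data, made up for illustration only.
data = pd.DataFrame({
    "Academic Group": ["Grp A", "Grp A", "Grp A", "Grp B", "Grp B", "Grp B"],
    "Math.SemGrade": ["A", "A+", "C", "B+", "D", "F"],
})

# Same mapping dict as above.
mapping = {'A': 'A', 'A+': 'A', 'AD': 'A', 'B': 'A', 'B+': 'A',
           'C': 'B', 'C+': 'B', 'D': 'B', 'D+': 'B',
           'E': 'C', 'F': 'C'}

# Map each grade to its merged bucket, then group on the mapped Series
# instead of the original column.
merged_grade = data["Math.SemGrade"].map(mapping)
math_comparison_size = data.groupby(["Academic Group", merged_grade]).size()
print(math_comparison_size)
```

Grades that are missing from `mapping` become NaN after `map`, and `groupby` drops NaN keys by default, so it is worth checking that every grade in the data has an entry in the dict.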

huangapple
  • Posted on January 6, 2023, 15:07:53
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75027953.html