英文:
How to add values of multiple rows during/after "groupby"?
问题
我有一个作业数据集,其中包括按学术小组排列的学生成绩,用于各种模块。在我的作业中,我使用以下代码来获取一个输出,然后将其硬编码到表中:
math_comparison_size = data.groupby(["Academic Group","Math.SemGrade"]).size()
是否有办法将某些行组合/合并在一起以增加它们的值?
比如:
A(将每个学术小组的A和A+值组合在一起,例如,对于A组,它将是28,而不是7和21)
依此类推...
英文:
I have a assignment dataset which consists of students' grades sorted by academic groups for various modules. In my assignment, I used the following code to acquire an output which I later hardcoded it into a table:
math_comparison_size = data.groupby(["Academic Group","Math.SemGrade"]).size()
Is there a way where I can combine/merge certain rows together to increase their values?
Such as:
A (combination of A and A+ values for each academic group E.g. for Group A it will be 28 instead of 7 and 21)
so on and forth...
答案1
得分: 1
更新
>>> df.unstack(level=0).groupby(df.index.levels[1].map(mapping)).sum()
学术小组 Grp A Grp B Grp C Grp D Grp E
数学学期成绩
A 43 49 93 82 39
B 24 69 65 59 57
C 20 8 5 23 13
输入数据:
学术小组 数学学期成绩
Grp A A 13
A+ 17
AD 6
B 1
B+ 6
C 9
C+ 0
D 9
D+ 6
F 20
Grp B A 6
A+ 8
AD 19
B 6
B+ 10
C 24
C+ 5
D 11
D+ 29
F 8
Grp C A 22
A+ 27
AD 16
B 14
B+ 14
C 26
C+ 14
D 0
D+ 25
F 5
Grp D A 29
A+ 23
AD 2
B 11
B+ 17
C 1
C+ 27
D 3
D+ 28
F 23
Grp E A 2
A+ 9
AD 4
B 9
B+ 15
C 18
C+ 10
D 5
D+ 24
F 13
dtype: int64
英文:
Update
>>> df.unstack(level=0).groupby(df.index.levels[1].map(mapping)).sum()
Academic Group Grp A Grp B Grp C Grp D Grp E
Math.SemGrade
A 43 49 93 82 39
B 24 69 65 59 57
C 20 8 5 23 13
Input data:
Academic Group Math.SemGrade
Grp A A 13
A+ 17
AD 6
B 1
B+ 6
C 9
C+ 0
D 9
D+ 6
F 20
Grp B A 6
A+ 8
AD 19
B 6
B+ 10
C 24
C+ 5
D 11
D+ 29
F 8
Grp C A 22
A+ 27
AD 16
B 14
B+ 14
C 26
C+ 14
D 0
D+ 25
F 5
Grp D A 29
A+ 23
AD 2
B 11
B+ 17
C 1
C+ 27
D 3
D+ 28
F 23
Grp E A 2
A+ 9
AD 4
B 9
B+ 15
C 18
C+ 10
D 5
D+ 24
F 13
dtype: int64
Old answer
If you want to group by the first letter (A -> A, A+ -> A, ...), you can use:
>>> (df.groupby(df['Maths Semester Grades'].str[0])
.sum(numeric_only=True).reset_index())
Maths Semester Grades Grp A Grp B Grp C Grp D Grp E
0 A 36 33 65 54 15
1 B 7 16 28 28 24
2 C 9 29 40 28 28
3 D 15 40 25 31 29
4 F 20 8 5 23 13
If you want to control the groups, use a mapping dict:
mapping = {'A': 'A', 'A+': 'A', 'AD': 'A', 'B': 'A', 'B+': 'A',
'C': 'B', 'C+': 'B', 'D': 'B', 'D+': 'B',
'E': 'C', 'F': 'C'}
>>> (df.groupby(df['Maths Semester Grades'].map(mapping))
.sum(numeric_only=True).reset_index()
Maths Semester Grades Grp A Grp B Grp C Grp D Grp E
0 A 43 49 93 82 39
1 B 24 69 65 59 57
2 C 20 8 5 23 13
Input dataframe:
>>> df
Maths Semester Grades Grp A Grp B Grp C Grp D Grp E
0 A 13 6 22 29 2
1 A+ 17 8 27 23 9
2 AD 6 19 16 2 4
3 B 1 6 14 11 9
4 B+ 6 10 14 17 15
5 C 9 24 26 1 18
6 C+ 0 5 14 27 10
7 D 9 11 0 3 5
8 D+ 6 29 25 28 24
9 F 20 8 5 23 13
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论