Creating a counter from columns in a dataset in Python
Question
I asked this question before but did not phrase it well, so I will try again. I have a dataset that I work with mainly through pandas and numpy. It has a column of employee IDs (which also includes the managers' IDs) and a column with each employee's manager ID:
emp_id mgr_id
1 3
2 3
3 5
4 3
5 7
6 7
7 10
8 3
9 7
10 -
My data looks similar to this. For every employee ID, whether or not that employee is a manager, I want to show how many people report to them, as a counter like this:
emp_id mgr_id emp_count
1 3 0
2 3 0
3 5 4
4 3 0
5 7 1
6 7 0
7 10 3
8 3 0
9 7 0
10 - 1
My code looks like this:
df['emp_count'] = df.groupby('mgr_id').transform('count')
When I run this I get an error:
> ValueError: Cannot set a DataFrame with multiple columns to the single column emp_count
I am not sure what I am doing wrong; I am fairly new to this.
Answer 1
Score: 2
Let's do:
# Count the unique employees reporting to each manager
counts = df['emp_id'].groupby(df['mgr_id'].astype(str)).nunique()
# Map the counts onto the emp_id column; employees with no reports get 0
df['emp_count'] = df['emp_id'].astype(str).map(counts).fillna(0).astype(int)
Result
emp_id mgr_id emp_count
0 1 3 0
1 2 3 0
2 3 5 4
3 4 3 0
4 5 7 1
5 6 7 0
6 7 10 3
7 8 3 0
8 9 7 0
9 10 - 1
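For anyone who wants to run the approach above end to end, here is a minimal sketch. It rebuilds the sample table from the question by hand (how the real data is loaded is not stated, so the DataFrame construction is an assumption) and uses `.fillna(0).astype(int)` to get integer counts.

```python
import pandas as pd

# Sample data copied from the question; "-" marks the employee with no manager.
# Building the frame by hand is an assumption about how the data is stored.
df = pd.DataFrame({
    "emp_id": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "mgr_id": ["3", "3", "5", "3", "7", "7", "10", "3", "7", "-"],
})

# Number of distinct employees under each manager, indexed by manager ID (as a string).
counts = df["emp_id"].groupby(df["mgr_id"].astype(str)).nunique()

# Look up each employee's own ID in that table; IDs that never appear
# as a manager are missing from it, so they are filled with 0.
df["emp_count"] = df["emp_id"].astype(str).map(counts).fillna(0).astype(int)

print(df)
```

The key step is that the counts are grouped by `mgr_id` but looked up by `emp_id`, which is why both sides are cast to the same string type before `map`.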
Answer 2
Score: 0
Another approach, though not as elegant as Shubham's answer:
df['emp_id'] = df['emp_id'].astype(str)
df['mgr_id'] = df['mgr_id'].astype(str)
df1 = df.groupby('mgr_id').count().reset_index().rename(columns={'emp_id': 'count', 'mgr_id': 'emp_id'})
df2 = df.merge(df1, how='left', on='emp_id')
df2['count'] = df2['count'].fillna(0).astype(int)
Result
emp_id mgr_id count
0 1 3 0
1 2 3 0
2 3 5 4
3 4 3 0
4 5 7 1
5 6 7 0
6 7 10 3
7 8 3 0
8 9 7 0
9 10 - 1
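If the real dataset has more than the two ID columns, `groupby('mgr_id').count()` returns one count per remaining column, which makes the rename-and-merge step messier. A possible variant, sketched here under that assumption, uses `size()` so there is exactly one count per manager. The `name` column is hypothetical and only stands in for whatever extra columns exist.

```python
import pandas as pd

# Sample data from the question plus a hypothetical extra column ("name").
df = pd.DataFrame({
    "emp_id": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "mgr_id": ["3", "3", "5", "3", "7", "7", "10", "3", "7", "-"],
    "name": list("ABCDEFGHIJ"),
})
df["emp_id"] = df["emp_id"].astype(str)
df["mgr_id"] = df["mgr_id"].astype(str)

# size() gives a single row count per manager, however many columns df has.
# The index is renamed to emp_id so it can serve as the merge key.
reports = (
    df.groupby("mgr_id").size()
      .rename("count")
      .rename_axis("emp_id")
      .reset_index()
)

# A left merge keeps every employee; those with no reports get NaN, then 0.
df2 = df.merge(reports, how="left", on="emp_id")
df2["count"] = df2["count"].fillna(0).astype(int)

print(df2)
```

Note that `size()` counts rows rather than distinct employees, so it matches the `nunique` approach only when each employee appears once.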