Creating a counter from columns in a dataset in Python

Question

I asked this question before but did not phrase it correctly, so I will try again. I have a dataset that I work with mainly through pandas and numpy. It has a column of employee IDs (which also includes the managers' own IDs) and a column of manager IDs:

emp_id      mgr_id 
   1          3
   2          3
   3          5      
   4          3
   5          7
   6          7 
   7          10
   8          3
   9          7
   10         -       

My data looks similar to this. For every employee ID, whether or not that employee is a manager, I want to show how many people report to them, so I want a counter like this:

emp_id     mgr_id     emp_count
   1         3            0
   2         3            0 
   3         5            4
   4         3            0
   5         7            1
   6         7            0 
   7         10           3
   8         3            0
   9         7            0
   10        -            1

My code looks like this:

    df['emp_count'] = df.groupby('mgr_id').transform('count')

When I run this, I get the following error:

> ValueError: Cannot set a DataFrame with multiple columns to the single column emp_count

I am not sure what I am doing wrong; I am fairly new to this.
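
A note on the error, as a minimal sketch rather than a diagnosis of the actual dataset: `groupby(...).transform('count')` applied to the whole frame returns one result column for every non-grouping column, so the right-hand side is a DataFrame, not a single Series. The sketch assumes the real data has at least one column beyond the two shown; the `name` column below is hypothetical and exists only to illustrate that.

```python
import pandas as pd

# Sample data from the question; the `name` column is hypothetical and only
# there so the frame has more than one non-grouping column.
df = pd.DataFrame({
    "emp_id": list(range(1, 11)),
    "mgr_id": ["3", "3", "5", "3", "7", "7", "10", "3", "7", "-"],
    "name": [f"emp{i}" for i in range(1, 11)],
})

# transform('count') on the whole frame returns one column per non-grouping
# column, i.e. a DataFrame rather than a single Series.
transformed = df.groupby("mgr_id").transform("count")
print(type(transformed).__name__, list(transformed.columns))

# Selecting one column first yields a Series that can be assigned, but it holds
# the size of each row's own mgr_id group, not the number of direct reports.
group_size = df.groupby("mgr_id")["emp_id"].transform("count")
print(group_size.tolist())
```

Even with a single column selected, `transform` answers "how many rows share my manager", which is a different quantity from "how many people report to me", so the counts still have to be looked up by `emp_id`.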

Answer 1

Score: 2

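Let's do:

# Count the unique employees reporting to each manager
counts = df['emp_id'].groupby(df['mgr_id'].astype(str)).nunique()

# Map the counts onto the emp_id column; employees with no reports get 0.
# The cast to int replaces fillna's downcast argument, which is deprecated
# in recent pandas releases.
df['emp_count'] = df['emp_id'].astype(str).map(counts).fillna(0).astype(int)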

Result

   emp_id mgr_id  emp_count
0       1      3          0
1       2      3          0
2       3      5          4
3       4      3          0
4       5      7          1
5       6      7          0
6       7     10          3
7       8      3          0
8       9      7          0
9      10      -          1
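
For anyone who wants to run this end to end, here is a self-contained sketch of the same approach. The construction of `df` is an assumption based on the table in the question (integer IDs with a `'-'` placeholder for the top-level manager); the real dataset may be typed differently.

```python
import pandas as pd

# Assumed reconstruction of the question's sample data.
df = pd.DataFrame({
    "emp_id": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "mgr_id": [3, 3, 5, 3, 7, 7, 10, 3, 7, "-"],
})

# Number of distinct employees per manager, keyed by the manager ID as a string
# so that the '-' placeholder and the numeric IDs share one key type.
counts = df["emp_id"].groupby(df["mgr_id"].astype(str)).nunique()

# Look up each employee's own ID in that table; IDs that never appear as a
# manager come back as NaN and are filled with 0.
df["emp_count"] = df["emp_id"].astype(str).map(counts).fillna(0).astype(int)

print(df)
```

Converting both columns to strings before the lookup is what lets `map` match an `emp_id` of 3 against a `mgr_id` of 3 when the manager column has object dtype.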

Answer 2

Score: 0

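Another approach, though not as elegant as Shubham's answer:

    # Use strings for both ID columns so the '-' placeholder compares cleanly
    df['emp_id'] = df['emp_id'].astype(str)
    df['mgr_id'] = df['mgr_id'].astype(str)
    # Count rows per manager, then relabel the manager ID as emp_id for the merge
    df1 = df.groupby('mgr_id').count().reset_index().rename(columns={'emp_id': 'count', 'mgr_id': 'emp_id'})
    # Left merge keeps every employee; those with no reports get NaN, filled with 0
    df2 = df.merge(df1, how='left', on='emp_id')
    df2['count'] = df2['count'].fillna(0).astype(int)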

Result

  emp_id mgr_id  count
0      1      3      0
1      2      3      0
2      3      5      4
3      4      3      0
4      5      7      1
5      6      7      0
6      7     10      3
7      8      3      0
8      9      7      0
9     10      -      1
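
A self-contained sketch of the merge-based idea, for comparison. This is a variation rather than the exact code above: it uses `groupby(...).size()` instead of `count()` and leaves the original frame unmodified. The sample `df` is again an assumed reconstruction of the question's table.

```python
import pandas as pd

# Assumed reconstruction of the question's sample data.
df = pd.DataFrame({
    "emp_id": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "mgr_id": [3, 3, 5, 3, 7, 7, 10, 3, 7, "-"],
})

# Work on a string-typed copy so the '-' placeholder and numeric IDs compare cleanly.
reports = df.assign(
    emp_id=df["emp_id"].astype(str),
    mgr_id=df["mgr_id"].astype(str),
)

# Rows per manager, relabelled so the manager ID becomes the merge key `emp_id`.
per_manager = (
    reports.groupby("mgr_id")
    .size()
    .rename("count")
    .rename_axis("emp_id")
    .reset_index()
)

# The left merge keeps every employee; those nobody reports to get NaN, then 0.
result = reports.merge(per_manager, how="left", on="emp_id")
result["count"] = result["count"].fillna(0).astype(int)

print(result)
```

The merge carries more overhead than a `map` lookup, but it extends naturally if several per-manager aggregates need to be attached at once.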

huangapple
  • Posted on 2023-02-24 00:21:05
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75547531.html