2023-02-24 00:21:05

Creating a counter from columns in a dataset in Python

Question

I asked this question before but did not phrase it correctly, so I will try again. I have a dataset that I work with mainly through pandas and numpy. It has a column of employee IDs (which also includes the managers' IDs) and a column of manager IDs:

emp_id      mgr_id 
   1          3
   2          3
   3          5      
   4          3
   5          7
   6          7 
   7          10
   8          3
   9          7
   10         -

My data looks similar to this. For every employee ID, whether or not that employee is a manager, I want to show how many people report to them, so I want a counter column like this:

emp_id     mgr_id     emp_count
   1         3            0
   2         3            0 
   3         5            4
   4         3            0
   5         7            1
   6         7            0 
   7         10           3
   8         3            0
   9         7            0
   10        -            1

My code looks like this:

    df['emp_count'] = df.groupby('mgr_id').transform('count')

When I run this, I get the following error:

> ValueError: Cannot set a DataFrame with multiple columns to the single column emp_count

I am not sure what I am doing wrong; I am fairly new to this.
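A likely reading of the error, sketched under assumptions: calling `transform` on the whole grouped frame returns one column for every non-grouping column, so if the real dataset has more columns than the two shown, the result is a multi-column DataFrame that cannot be assigned to a single column. The `dept` column and the names `per_column` and `peers` below are hypothetical and only serve to illustrate this.

```python
import pandas as pd

# Small sample; "dept" is a hypothetical extra column, not part of the question.
df = pd.DataFrame({
    "emp_id": [1, 2, 3, 4],
    "mgr_id": [3, 3, 5, 3],
    "dept": ["a", "a", "b", "a"],
})

# transform on the whole grouped frame yields one column per non-key column,
# i.e. a DataFrame rather than a Series.
per_column = df.groupby("mgr_id").transform("count")
print(per_column.columns.tolist())

# Selecting a single column first yields a Series, which can be assigned.
peers = df.groupby("mgr_id")["emp_id"].transform("count")
print(peers.tolist())
```

Note that the single-column version counts how many employees share each row's manager. It does not count the reports of the employee in that row, so it removes the error without producing the desired `emp_count` column.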

Answer 1

Score: 2

Let's do this:

    # Count the unique employees reporting to each manager
    counts = df['emp_id'].groupby(df['mgr_id'].astype(str)).nunique()
    # Map the counts onto the emp_id column
    df['emp_count'] = df['emp_id'].astype(str).map(counts).fillna(0, downcast='infer')

Result

   emp_id mgr_id  emp_count
0       1      3          0
1       2      3          0
2       3      5          4
3       4      3          0
4       5      7          1
5       6      7          0
6       7     10          3
7       8      3          0
8       9      7          0
9      10      -          1
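
For reference, a self-contained sketch of the same idea that can be run as is. It assumes the sample data from the question, that each `emp_id` appears only once (so counting rows and counting unique IDs give the same result), and that the missing manager is stored as the string "-". The name `reports` is introduced here for illustration. It uses `value_counts` instead of `groupby(...).nunique()` and an explicit `astype(int)` instead of the `downcast` argument of `fillna`, which recent pandas releases flag as deprecated.

```python
import pandas as pd

# Sample data from the question; "-" marks the employee with no manager.
df = pd.DataFrame({
    "emp_id": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "mgr_id": ["3", "3", "5", "3", "7", "7", "10", "3", "7", "-"],
})

# Number of rows (direct reports) per manager ID, indexed by the ID as a string.
reports = df["mgr_id"].astype(str).value_counts()

# Look up each employee's own ID in that table. IDs that never appear as a
# manager come back as NaN, so fill those with 0 and cast back to int.
df["emp_count"] = df["emp_id"].astype(str).map(reports).fillna(0).astype(int)

print(df)
```

Both sides are compared as strings because the "-" placeholder makes `mgr_id` a string column, while `emp_id` is numeric.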

Answer 2

Score: 0

Another approach, though not as elegant as Shubham's answer:

    # Compare IDs as strings, since mgr_id contains the '-' placeholder
    df['emp_id'] = df['emp_id'].astype(str)
    df['mgr_id'] = df['mgr_id'].astype(str)
    # Count rows per manager, then rename so the manager ID becomes the join key
    df1 = df.groupby('mgr_id').count().reset_index().rename(columns={'emp_id': 'count', 'mgr_id': 'emp_id'})
    df2 = df.merge(df1, how='left', on='emp_id')
    # Employees with no reports get NaN from the left merge; use 0 instead
    df2['count'] = df2['count'].fillna(0).astype(int)

Result

   emp_id mgr_id  count
0       1      3      0
1       2      3      0
2       3      5      4
3       4      3      0
4       5      7      1
5       6      7      0
6       7     10      3
7       8      3      0
8       9      7      0
9      10      -      1
