January 3, 2020, 23:33:11

Check certain columns' values when using groupby in Pandas

Question

I have a dataframe like this:

import pandas as pd

df = pd.DataFrame({'Name': ['Bob', 'Bob', 'Bob', 'Joe', 'Joe', 'Joe'],
                   'ID': [1, 2, 3, 4, 5, 6],
                   'Value': [1, 1, 1, 0, 0, 1]})
df
 Name    ID    Value
 Bob     1       1
 Bob     2       1
 Bob     3       1
 Joe     4       0
 Joe     5       0
 Joe     6       1

The goal is to compute a Result column by checking each group in the Name column, in this case Bob and Joe.

For each group, if every value in the Value column is 1, the Result column for that group should be all 1. If the values are all 0, or a mix of 1 and 0, the Result column for that group should be all 0.

So the output should look like this:

Name    ID    Value    Result
 Bob     1       1       1   
 Bob     2       1       1   
 Bob     3       1       1   
 Joe     4       0       0   
 Joe     5       0       0   
 Joe     6       1       0

The difficulty is creating these groups and then checking each one.

My attempt:

df = df.groupby('Name')
df['Result'] = df.apply(lambda x: x['Value'])

Answer 1

Score: 4

Use groupby + transform with 'all':

df['Result'] = df.groupby('Name')['Value'].transform('all').astype(int)
# or df['Result'] = df['Value'].eq(1).groupby(df['Name']).transform('all').astype(int)
print(df)

  Name  ID  Value  Result
0  Bob   1      1       1
1  Bob   2      1       1
2  Bob   3      1       1
3  Joe   4      0       0
4  Joe   5      0       0
5  Joe   6      1       0
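
For reference, here is a self-contained sketch of the same approach, reusing the sample data from the question. The variable name explicit is only illustrative. The first form relies on 'all' testing truthiness, which is enough when Value holds only 0 and 1. The second form compares against 1 first, so a value such as 2 would count as "not 1".

```python
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame({'Name': ['Bob', 'Bob', 'Bob', 'Joe', 'Joe', 'Joe'],
                   'ID': [1, 2, 3, 4, 5, 6],
                   'Value': [1, 1, 1, 0, 0, 1]})

# transform('all') reduces each group to a single boolean and broadcasts it
# back to every row of that group, so the result lines up with df's index.
df['Result'] = df.groupby('Name')['Value'].transform('all').astype(int)

# Explicit variant: test Value == 1 first, then reduce per group.
# "explicit" is an illustrative name, not part of the original answer.
explicit = df['Value'].eq(1).groupby(df['Name']).transform('all').astype(int)

print(df['Result'].tolist())   # [1, 1, 1, 0, 0, 0]
print(explicit.tolist())       # [1, 1, 1, 0, 0, 0]
```

Unlike the attempt in the question, df is never overwritten with the groupby object here, so assigning a new column still works.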

Answer 2

Score: 2

If I understand correctly:

df['Result'] = df.groupby('Name').Value.all().reindex(df.Name).astype(int).values
df
Out[57]:
  Name  ID  Value  Result
0  Bob   1      1       1
1  Bob   2      1       1
2  Bob   3      1       1
3  Joe   4      0       0
4  Joe   5      0       0
5  Joe   6      1       0
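
A possible variation on this idea, shown only as a sketch: compute one boolean per name, then look it up for each row with Series.map instead of reindex. The name per_group is illustrative. Because map keeps the original row index, the .values step (which the line above presumably uses to avoid aligning on the Name labels) is not needed.

```python
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame({'Name': ['Bob', 'Bob', 'Bob', 'Joe', 'Joe', 'Joe'],
                   'ID': [1, 2, 3, 4, 5, 6],
                   'Value': [1, 1, 1, 0, 0, 1]})

# One boolean per name: True only if every Value in that group is truthy.
# "per_group" is an illustrative name, not part of the original answer.
per_group = df.groupby('Name')['Value'].all()

# Look up each row's Name in per_group; the result keeps df's index,
# so it can be assigned directly as a new column.
df['Result'] = df['Name'].map(per_group).astype(int)

print(df['Result'].tolist())   # [1, 1, 1, 0, 0, 0]
```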
