2023年6月26日 11:05:53go评论73阅读模式

英文:

Counting the number of rows based on the value of N number of columns

问题

我有一个类似这样的数据集：

Col1    Col2    Col3
A        100     100
A         0       0
A         0      100
B        100      0
C        100     100
C        100     100

我想要统计基于A B和C中出现100（或任何其他大于零的值）的行数，结果如下：

    Col2_counts   Col3_counts
A       1           2
B       1           0
C       2           2

这样我就可以计算Col2和Col3中A B C的总百分比等等。

我尝试过使用df.groupby(['Col1', 'Col2', 'Col3']).transform('count')，但它没有给我想要的结果。

英文:

I have a dataset that looks like this:

Col1    Col2    Col3
A        100     100
A         0       0
A         0      100
B        100      0
C        100     100
C        100     100

I want to count the number of rows with 100 (or any other values greater than zero) based on A B and C

which will result to this:

    Col2_counts   Col3_counts
A       1           2
B       1           0
C       2           2

so I can calculate the total percentage of A B C in Col2 and Col3 etc.

I tried df.groupby(['Col1', 'Col 2', 'Col3']).transform ('count'), but it doesn't give me the desired result.

答案1

得分: 2

df.set_index('Col1').gt(0).groupby(level=0).sum()

output:

Col2    Col3

Col1
A 1 2
B 1 0
C 2 2

英文:

df.set_index(&#39;Col1&#39;).gt(0).groupby(level=0).sum()

output:

	    Col2	Col3
Col1		
A     	1	    2
B	    1	    0
C	    2	    2

答案2

得分: 0

以下是已翻译的内容：

(
    df
    .set_index("Col1")
    .eq(100)
    .groupby("Col1")
    .sum()
    .add_suffix("_counts")
)

如chrslg指出，Panda Kim的答案评估行数> 0而不是行数== 100。此外，您可以在末尾添加"_counts"后缀以获取您示例中的列名。

以下是每个操作的逐步解释：

将"Col1"设置为DataFrame的索引。

评估单元格是否等于100。

	  Col2	  Col3
Col1		
A	  True	  True
A	  False	  False
A	  False	  True
B	  True	  False
C	  True	  True
C	  True	  True

按"Col1"索引分组并计算"True"值的数量。使用"sum"操作有效，因为它将"True"转换为1，"False"转换为0。

对所有列添加"_counts"后缀。

	  Col2_counts	Col3_counts
Col1		
A	      1	             2
B	      1	             0
C	      2	             2

英文:

The following block produces your expected output:

(
    df
    .set_index(&quot;Col1&quot;)
    .eq(100)
    .groupby(&quot;Col1&quot;)
    .sum()
    .add_suffix(&quot;_counts&quot;)
)

As chrslg pointed out, Panda Kim's answer evaluates rows > 0 instead of rows == 100. Also, you can add the "_counts" suffix at the end to get the column names from your example.

Here's a the step-by-step explanation of each operation:

Set "Col1" as the DataFrame's index.

Evaluate if cells are equal (eq) to 100.

	  Col2	  Col3
Col1		
A	  True	  True
A	  False	  False
A	  False	  True
B	  True	  False
C	  True	  True
C	  True	  True

Group by the "Col1" index and count True values. Using the sum operation works because it converts True to 1 and False to 0.

Add a "_counts" suffix to all columns.

	  Col2_counts	Col3_counts
Col1		
A	      1	             2
B	      1	             0
C	      2	             2

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

根据N列的值计算行数。

问题

答案1

答案2

如何重塑我的数据，以便具有多个观察的ID按ID的所有可能观察对进行分组？

What's the equivalent of Flask's @before_first_request in golang?

使用请求混合表单和文件，带注释和可选字段。

如何使用Python检索OpenShift容器Pod的名称？

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

发表评论