February 6, 2023, 19:05:44

Count number of occurrences in DataFrame per column

Question

I have a sample DataFrame in which every value is a user ID:

from	to
1	3
1	2
2	3

How do I count the occurrences in each column, sum the counts for identical values, and display the result in the following format in a new DataFrame?

UserID	Occurrences
1	2
2	2
3	2

Thank you.

Answer 1

Score: 2

If I understand correctly, you can use stack followed by value_counts:

out = (df.stack().value_counts()
       .to_frame('Occurrences')
       .rename_axis('UserID')
       .reset_index())

print(out)
   UserID  Occurrences
0       1            2
1       2            2
2       3            2
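
A self-contained sketch of the same idea, using the sample data from the question, may make the intermediate steps easier to follow. The `sort_index()` call is an addition that is not in the answer; it is assumed here only so that the rows come out ordered by UserID regardless of how `value_counts` breaks ties.

```python
import pandas as pd

# Sample data from the question: every value is a user ID.
df = pd.DataFrame({"from": [1, 1, 2], "to": [3, 2, 3]})

# stack() flattens both columns into a single Series
# indexed by (row label, column name).
stacked = df.stack()

# value_counts() tallies each distinct ID. sort_index() is an added step
# (not part of the answer) that orders the result by UserID.
counts = stacked.value_counts().sort_index()

out = (counts.to_frame("Occurrences")
             .rename_axis("UserID")
             .reset_index())
print(out)
```

Because stacking ignores which column a value came from, the same code should work unchanged if more ID columns are added.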

Answer 2

Score: 1

Use DataFrame.melt with GroupBy.size:

df = df.melt(value_name='UserID').groupby('UserID').size().reset_index(name='Occurrences')
print(df)
   UserID  Occurrences
0       1            2
1       2            2
2       3            2
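
To see what melt contributes, the one-liner can be split into two steps. This is only an illustrative sketch on the question's sample data; `long_df` is a name introduced here, not one used in the answer.

```python
import pandas as pd

df = pd.DataFrame({"from": [1, 1, 2], "to": [3, 2, 3]})

# melt() reshapes to long format: one row per original cell.
# The default "variable" column records the source column ("from" or "to"),
# and value_name="UserID" names the column holding the IDs.
long_df = df.melt(value_name="UserID")
print(long_df)

# size() counts the rows in each UserID group; groupby sorts keys by default.
out = long_df.groupby("UserID").size().reset_index(name="Occurrences")
print(out)
```

Unlike the answer's one-liner, this version keeps the original `df` intact rather than overwriting it.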

Answer 3

Score: 0

The pd.Series.value_counts method can be used to count the instances of each user ID in the "from" and "to" columns, and pd.concat can be used to combine the results. Finally, call reset_index on the resulting Series to turn it into a DataFrame, then group by UserID and sum:

import pandas as pd
df = pd.DataFrame({'from': [1, 1, 2], 'to': [3, 2, 3]})
# Count each column separately, then append the two results
occur = pd.concat([df['from'].value_counts(), df['to'].value_counts()])
result_df = occur.reset_index()
result_df.columns = ['UserID', 'occur']
# An ID present in both columns appears twice, so sum its partial counts
result_df = result_df.groupby(['UserID'])['occur'].sum().reset_index()
print(result_df)
   UserID  occur
0       1      2
1       2      2
2       3      2
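
The final sum is needed because pd.concat only appends the two per-column tallies. The sketch below, on the question's sample data, shows the duplicated index entry and then adds the partial counts. Note that it groups directly on the index with `groupby(level=0)`, a small variation on the answer's reset_index and column renaming, not the answer's exact code.

```python
import pandas as pd

df = pd.DataFrame({"from": [1, 1, 2], "to": [3, 2, 3]})

# Per-column tallies: "from" holds 1 twice and 2 once,
# "to" holds 3 twice and 2 once.
from_counts = df["from"].value_counts()
to_counts = df["to"].value_counts()

# concat appends the two Series, so user 2 appears twice in the index.
combined = pd.concat([from_counts, to_counts])
print(combined.index.duplicated().any())

# Grouping on the index (level=0) adds up the partial counts per ID.
totals = combined.groupby(level=0).sum()
print(totals.to_dict())
```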
