Count number of occurrences in DataFrame per column
Question
I have a sample DataFrame in which every value is a user ID:
from | to |
---|---|
1 | 3 |
1 | 2 |
2 | 3 |
How do I count the occurrences of each value across both columns, sum the counts for matching values, and display the result in a new DataFrame in the following format?
UserID | Occurrences |
---|---|
1 | 2 |
2 | 2 |
3 | 2 |
Thank you.
Answer 1
Score: 2
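IIUC, you can use DataFrame.stack followed by Series.value_counts:
out = (df.stack().value_counts()
         .sort_index()  # sort by user ID so tied counts print in a fixed order
         .to_frame('Occurrences')
         .rename_axis('UserID')
         .reset_index())
print(out)
   UserID  Occurrences
0       1            2
1       2            2
2       3            2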
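To make the intermediate steps visible, here is a minimal self-contained sketch. It assumes the sample frame from the question, rebuilt here as df, and the helper name count_user_ids is hypothetical, not part of pandas. stack flattens both columns into one long Series, and value_counts then counts each distinct value in it.

```python
import pandas as pd

# Sample data from the question; every value is a user ID.
df = pd.DataFrame({"from": [1, 1, 2], "to": [3, 2, 3]})


def count_user_ids(frame: pd.DataFrame) -> pd.DataFrame:
    """Count how often each value appears anywhere in the frame (hypothetical helper)."""
    stacked = frame.stack()          # one long Series holding all six values
    counts = stacked.value_counts()  # value -> number of appearances
    return (counts.sort_index()      # order by user ID rather than by count
                  .rename_axis("UserID")
                  .reset_index(name="Occurrences"))


out = count_user_ids(df)
print(out)
```

Note that value_counts skips missing values by default, so a NaN in either column would not be counted as a user ID.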
Answer 2
Score: 1
Use DataFrame.melt with GroupBy.size:
# melt puts every value into a single 'UserID' column; size counts the rows per ID
out = df.melt(value_name='UserID').groupby('UserID').size().reset_index(name='Occurrences')
print(out)
   UserID  Occurrences
0       1            2
1       2            2
2       3            2
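If the real frame has columns that are not user IDs, the melt can be restricted to the ID columns. The sketch below is an illustration under that assumption: the extra "amount" column is hypothetical and does not appear in the question. It uses the value_vars parameter of DataFrame.melt to pick only "from" and "to".

```python
import pandas as pd

# Sample data from the question plus a hypothetical non-ID column ("amount").
df = pd.DataFrame({"from": [1, 1, 2], "to": [3, 2, 3], "amount": [10, 20, 30]})

# value_vars limits the melt to the ID columns, so "amount" is never counted.
long_form = df.melt(value_vars=["from", "to"], value_name="UserID")

# groupby sorts its keys by default, so the result is ordered by UserID.
out = long_form.groupby("UserID").size().reset_index(name="Occurrences")
print(out)
```

Because groupby sorts the group keys, no separate sort step is needed here.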
Answer 3
Score: 0
The pd.Series.value_counts method can be used to count the occurrences of each user ID in the "from" and "to" columns, and pd.concat can be used to combine the two results. Series.reset_index then turns the combined Series into a DataFrame, and a final groupby sums the counts for IDs that appear in both columns:
import pandas as pd
df = pd.DataFrame({'from': [1, 1, 2], 'to': [3, 2, 3]})
# count each column separately, then put both sets of counts in one Series
occur = pd.concat([df['from'].value_counts(), df['to'].value_counts()])
result_df = occur.reset_index()
result_df.columns = ['UserID', 'Occurrences']
# an ID present in both columns has two rows at this point; add them together
result_df = result_df.groupby(['UserID'])['Occurrences'].sum().reset_index()
print(result_df)
   UserID  Occurrences
0       1            2
1       2            2
2       3            2
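A possible shortcut for the two-column case, offered as a sketch rather than as this answer's method: Series.add aligns the two count Series on their index, so the concat and groupby steps can be collapsed into one addition. The variable names from_counts, to_counts and total are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"from": [1, 1, 2], "to": [3, 2, 3]})

from_counts = df["from"].value_counts()  # counts for the "from" column
to_counts = df["to"].value_counts()      # counts for the "to" column

# Series.add aligns on the index; fill_value=0 treats an ID that is
# missing from one column as zero instead of producing NaN.
total = from_counts.add(to_counts, fill_value=0).astype(int)

result_df = (total.sort_index()
                  .rename_axis("UserID")
                  .reset_index(name="Occurrences"))
print(result_df)
```

The astype(int) call is there because filling missing entries makes the sum floating point. This form only suits a fixed pair of columns; for many columns the stack or melt approaches scale more naturally.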