Count number of occurrences in DataFrame per column

Question

I have a sample dataframe in which all values are user IDs:

from  to
1     3
1     2
2     3

How do I count the occurrences of each value in each column, sum the counts for identical values, and display the result in a new dataframe in the following format?

UserID  Occurrences
1       2
2       2
3       2

Thank you.

Answer 1

Score: 2

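If I understand correctly, you can use stack followed by value_counts:

# Flatten both columns into a single Series, then count each value
out = (df.stack().value_counts()
       .to_frame('Occurrences')
       .rename_axis('UserID')
       .reset_index())
print(out)

   UserID  Occurrences
0       1            2
1       2            2
2       3            2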
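
value_counts orders its result by count, so the relative order of rows with equal counts is best not relied on. A minimal sketch of the same stack-then-count idea with an explicit sort by UserID, assuming the sample data from the question; the helper name count_user_ids is hypothetical and not part of the answer:

```python
import pandas as pd


def count_user_ids(df: pd.DataFrame) -> pd.DataFrame:
    """Count how often each value appears across all columns of df."""
    counts = (
        df.stack()          # flatten every column into a single Series
        .value_counts()     # occurrences of each distinct value
        .sort_index()       # order rows by UserID rather than by count
    )
    return counts.rename_axis("UserID").reset_index(name="Occurrences")


# Sample data from the question
df = pd.DataFrame({"from": [1, 1, 2], "to": [3, 2, 3]})
out = count_user_ids(df)
print(out)
```

Sorting the index costs little here and makes the row order independent of how ties are broken.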

Answer 2

Score: 1

Use DataFrame.melt with GroupBy.size:

# Reshape to long format, then count the rows for each UserID
df = df.melt(value_name='UserID').groupby('UserID').size().reset_index(name='Occurrences')
print(df)
   UserID  Occurrences
0       1            2
1       2            2
2       3            2
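
To see why this works, it can help to look at the intermediate long-format frame that melt produces. A sketch using the question's sample data; the variable names long_df and counts are illustrative only:

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({"from": [1, 1, 2], "to": [3, 2, 3]})

# melt turns every cell into its own row: the source column name goes
# into 'variable' and the cell value into 'UserID'
long_df = df.melt(value_name="UserID")
print(long_df)

# size() counts the rows in each UserID group; groupby sorts the keys
counts = long_df.groupby("UserID").size().reset_index(name="Occurrences")
print(counts)
```

Keeping the result in a separate variable, rather than reassigning df, leaves the original frame available for further use.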

Answer 3

Score: 0

The pd.Series.value_counts method can be used to count the occurrences of each user ID in the "from" and "to" columns, and pd.concat can combine the results. Finally, reset_index turns the resulting Series into a dataframe, and grouping by UserID sums the counts from the two columns:

import pandas as pd
df = pd.DataFrame({'from': [1, 1, 2], 'to': [3, 2, 3]})

# Count each column separately, then put the two count Series end to end
occur = pd.concat([df['from'].value_counts(), df['to'].value_counts()])
result_df = occur.reset_index()
result_df.columns = ['UserID', 'occur']
# Sum the per-column counts for each UserID
result_df = result_df.groupby(['UserID'])['occur'].sum().reset_index()
print(result_df)

   UserID  occur
0       1      2
1       2      2
2       3      2
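
The concat-then-groupby steps can also be written as an element-wise addition of the two count Series. This is a sketch of an alternative, not the answer's own code. It assumes the question's sample data; fill_value=0 is there so that an ID appearing in only one column still gets a count, and the cast back to int is included because the aligned addition may return floats:

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({"from": [1, 1, 2], "to": [3, 2, 3]})

from_counts = df["from"].value_counts()
to_counts = df["to"].value_counts()

# Align the two Series on UserID and add them; an ID missing from one
# side is treated as a count of 0 instead of producing NaN
total = from_counts.add(to_counts, fill_value=0).astype(int).sort_index()

result_df = total.rename_axis("UserID").reset_index(name="occur")
print(result_df)
```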

huangapple
  • Posted on February 6, 2023, 19:05:44
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75360501.html