Find the mode in a pandas DataFrame
Question
I want to find the mode among all values in C1_IND and C2_IND pooled together. I don't want the mode of each column separately.
import pandas as pd

# Sample data: each row carries one label in every *_IND column
records = [{"col1": 123, "C1_IND": "Rev", "C2_IND": "Hold"},
           {"col1": 456, "C1_IND": "Hold", "C2_IND": "Rev"},
           {"col1": 123, "C1_IND": "Hold", "C2_IND": "Service"},
           {"col1": 1236, "C1_IND": "Man", "C2_IND": "Man"}]
df = pd.DataFrame(records)
print(df)
As another example, find the mode among all values in C1_IND and C3_IND.
import pandas as pd

# Same data with a third indicator column, C3_IND
records = [{"col1": 123, "C1_IND": "Rev", "C2_IND": "Hold", "C3_IND": "Hold"},
           {"col1": 456, "C1_IND": "Hold", "C2_IND": "Rev", "C3_IND": "Rev"},
           {"col1": 123, "C1_IND": "Hold", "C2_IND": "Service", "C3_IND": "Service"},
           {"col1": 1236, "C1_IND": "Man", "C2_IND": "Man", "C3_IND": "Man"}]
df = pd.DataFrame(records)
print(df)
Answer 1
Score: 1
You can use filter to select the columns of interest (here by regex), then stack to flatten them into a single Series, and finally mode:
df.filter(regex=r'C\d+_IND').stack().mode().iloc[0]
NB. If there are several modes, this keeps only the first one. If you want all of them, remove the iloc[0], optionally replacing it with squeeze.
Output: 'Hold'
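Note that the regex above matches every C<number>_IND column, so on the second example it would also pool C2_IND. If only specific columns should be pooled (such as C1_IND and C3_IND), one option is to list them explicitly. Below is a minimal sketch of that variant. The helper name pooled_modes is hypothetical (not part of pandas), and it assumes the column names have no trailing space:

```python
import pandas as pd

# Sample data from the question's second example
# (assumes the key is "C1_IND" with no trailing space).
records = [
    {"col1": 123, "C1_IND": "Rev", "C2_IND": "Hold", "C3_IND": "Hold"},
    {"col1": 456, "C1_IND": "Hold", "C2_IND": "Rev", "C3_IND": "Rev"},
    {"col1": 123, "C1_IND": "Hold", "C2_IND": "Service", "C3_IND": "Service"},
    {"col1": 1236, "C1_IND": "Man", "C2_IND": "Man", "C3_IND": "Man"},
]
df = pd.DataFrame(records)


def pooled_modes(frame, columns):
    """Hypothetical helper: return all modes of the values pooled across `columns`."""
    # stack() flattens the selected columns into one Series, so mode()
    # counts values across columns instead of within each column.
    return frame[columns].stack().mode().tolist()


modes_c1_c3 = pooled_modes(df, ["C1_IND", "C3_IND"])
print(modes_c1_c3)  # ['Hold']
```

Returning a list keeps every tied value, so the caller can decide how to break ties rather than silently taking the first.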