How to calculate the weighted average of all other products in Python
Question
I have a dataframe as below:
category  sub category  price  revenue
A         A1            100    1000
A         A2            110    990
A         A3            120    890
B         B1            90     1200
B         B2            100    1100
B         B3            95     1050
I need to create a new column holding the weighted average price (weighted by revenue within a category) of all subcategories in the same category except the one in the row's own sub category column. For example, for the first row I need the weighted average price of A2 and A3 only, since A1 is the value in the sub category column. Can someone please help?
Answer 1
Score: 1
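You can compute the weighted mean manually: take each category's totals and subtract the row's own values from both the numerator and the denominator:
# price * revenue per row, indexed by category / sub category
tmp = (df.set_index(['category', 'sub category'])
         .eval('prod = price * revenue')
      )
# group on the first index level (category)
g = tmp.groupby(level=0)
# (category total - own value) for the weighted sum, divided by the same for revenue
out = (g['prod'].transform('sum')
        .sub(tmp['prod'])
        .div(g['revenue'].transform('sum').sub(tmp['revenue']))
      )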
Output:
category  sub category
A         A1              114.734043
          A2              109.417989
          A3              104.974874
B         B1               97.558140
          B2               92.333333
          B3               94.782609
dtype: float64
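Since the question asks for a new column rather than a separate Series, here is a minimal self-contained sketch of the same leave-one-out idea that keeps the original index and assigns the result back to df. The sample data is copied from the question; the column name other_wavg_price is made up for illustration and is not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    'category': ['A', 'A', 'A', 'B', 'B', 'B'],
    'sub category': ['A1', 'A2', 'A3', 'B1', 'B2', 'B3'],
    'price': [100, 110, 120, 90, 100, 95],
    'revenue': [1000, 990, 890, 1200, 1100, 1050],
})

# Row-level price * revenue, aligned with df's index
prod = df['price'] * df['revenue']

# Per-category totals, broadcast back to every row
total_prod = prod.groupby(df['category']).transform('sum')
total_rev = df['revenue'].groupby(df['category']).transform('sum')

# Leave-one-out weighted average: remove the row's own contribution
# from both sums. 'other_wavg_price' is a hypothetical column name.
df['other_wavg_price'] = (total_prod - prod) / (total_rev - df['revenue'])

print(df)
```

One thing to watch for: in a category with a single subcategory both differences are zero, so that row should come out as NaN rather than a number, and may need separate handling.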