Replace a value in pandas with that value divided by the number of times it occurs
Question
I have a DataFrame as follows:
ID Unit_ID Price
1 1 50
2 2 40
3 1 10000
3 2 10000
3 3 10000
3 4 10000
6 1 10000
8 3 10000
In the DataFrame above, I want to replace each Price of 10000 with 10000 divided by the number of rows that have the same ID and Price = 10000. For ID 3 that count is 4, so each of those rows becomes 2500.
Expected output:
ID Unit_ID Price
1 1 50
2 2 40
3 1 2500
3 2 2500
3 3 2500
3 4 2500
6 1 10000
8 3 10000
Answer 1
Score: 1
Create a mask and divide the filtered rows by the count of True values, obtained with sum. Note that this counts every row with Price equal to 10000, regardless of ID:
mask = df.Price == 10000
df.loc[mask, 'Price'] /= mask.sum()
# same as
# df.loc[mask, 'Price'] = df.loc[mask, 'Price'] / mask.sum()
print(df)
ID Unit_ID Price
0 1 1 50.0
1 2 2 40.0
2 3 1 2500.0
3 3 2 2500.0
4 3 3 2500.0
5 3 4 2500.0
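The mask idea can also be combined with a per-ID count so that it matches the expected output on the full eight-row sample. The following is a minimal sketch, not the answerer's code: it assumes the sample frame from the question, casts Price to float up front so the division does not have to change the column dtype in place, and uses a hypothetical helper name per_id_count.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "ID": [1, 2, 3, 3, 3, 3, 6, 8],
    "Unit_ID": [1, 2, 1, 2, 3, 4, 1, 3],
    "Price": [50, 40, 10000, 10000, 10000, 10000, 10000, 10000],
})
# Cast first so float results fit the column (assumption, not in the original answer)
df["Price"] = df["Price"].astype(float)

mask = df["Price"] == 10000

# For the masked rows only, count how many of them share each ID
per_id_count = df.loc[mask].groupby("ID")["Price"].transform("size")

# Division aligns on the index, so only the masked rows are touched
df.loc[mask, "Price"] = df.loc[mask, "Price"] / per_id_count
print(df)
```

On this sample mask.sum() is 6, which is why dividing by it alone would not give 2500 for ID 3.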
If you want to divide all values by their counts:
df['Price'] /= df.groupby(by="Price")['Price'].transform('size')
EDIT: To count per combination of ID and Price, group by both columns:
df['Price'] /= df.groupby(by=["ID", "Price"])['Price'].transform('size')
print(df)
ID Unit_ID Price
0 1 1 50.0
1 2 2 40.0
2 3 1 2500.0
3 3 2 2500.0
4 3 3 2500.0
5 3 4 2500.0
6 6 1 10000.0
7 8 3 10000.0
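For anyone who wants to reproduce the output above, here is a self-contained sketch of the groupby version. It assumes the sample frame from the question; group_size is an illustrative name that does not appear in the original answer.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "ID": [1, 2, 3, 3, 3, 3, 6, 8],
    "Unit_ID": [1, 2, 1, 2, 3, 4, 1, 3],
    "Price": [50, 40, 10000, 10000, 10000, 10000, 10000, 10000],
})

# Size of each (ID, Price) group, broadcast back to every row of the frame
group_size = df.groupby(["ID", "Price"])["Price"].transform("size")

# Replacing the whole column lets pandas switch it to float
df["Price"] = df["Price"] / group_size
print(df)
```

Keep in mind that this divides every price by its group size, not only 10000, so any other price repeated within the same ID would be divided as well.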
Answer 2
Score: 1
If you just want to replace the rows where Price is 10000, you can do:
df.loc[df.Price==10000, 'Price']=10000/len(df.loc[df.Price==10000])
If you want to divide every row by the count of its value, you can use groupby and transform:
df.Price = df.groupby(by="Price").Price.transform(lambda x: x/len(x))
ID Unit_ID Price
0 1 1 50
1 2 2 40
2 3 1 2500
3 3 2 2500
4 3 3 2500
5 3 4 2500
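Grouping by Price alone counts matching rows across all IDs. The sketch below, which assumes the eight-row sample from the question and uses the illustrative names by_price and by_id_price, compares that with grouping by both ID and Price using the same lambda.

```python
import pandas as pd

# Sample data from the question, with Price as float so the results are float too
df = pd.DataFrame({
    "ID": [1, 2, 3, 3, 3, 3, 6, 8],
    "Unit_ID": [1, 2, 1, 2, 3, 4, 1, 3],
    "Price": [50.0, 40.0, 10000.0, 10000.0, 10000.0, 10000.0, 10000.0, 10000.0],
})

# Group by Price only: all six rows priced 10000 fall into one group
by_price = df.groupby("Price")["Price"].transform(lambda x: x / len(x))

# Group by ID and Price: only rows that share both are counted together
by_id_price = df.groupby(["ID", "Price"])["Price"].transform(lambda x: x / len(x))

print(pd.DataFrame({"by_price": by_price, "by_id_price": by_id_price}))
```

On this sample the Price-only grouping divides 10000 by 6 for IDs 3, 6 and 8 alike, while the two-column grouping gives 2500 for ID 3 and leaves IDs 6 and 8 at 10000.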