Replace a value with that value divided by the number of times it occurs, in pandas
Question
I have a DataFrame as follows:
ID    Unit_ID       Price
1     1             50
2     2             40
3     1             10000
3     2             10000
3     3             10000
3     4             10000
6     1             10000
8     3             10000
In the DataFrame above, I want to replace each Price of 10000 with 10000 divided by the number of rows that share the same ID and also have Price = 10000.
For ID 3 that count is 4, so each of those rows becomes 2500.
Expected output:
ID    Unit_ID       Price
1     1             50
2     2             40
3     1             2500
3     2             2500
3     3             2500
3     4             2500
6     1             10000
8     3             10000
Answer 1
Score: 1
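Create a boolean mask for Price == 10000 and divide the matching rows by the number of True values in the mask, obtained with sum. This counts every 10000 row regardless of ID, so it matches the expected output only when all such rows share one ID, as in the six-row sample below: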
mask = df.Price == 10000
df.loc[mask, 'Price'] /= mask.sum()
# same as
# df.loc[mask, 'Price'] = df.loc[mask, 'Price'] / mask.sum()
print(df)
   ID  Unit_ID   Price
0   1        1    50.0
1   2        2    40.0
2   3        1  2500.0
3   3        2  2500.0
4   3        3  2500.0
5   3        4  2500.0
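If only the 10000 rows should change and the count should be taken per ID, the mask can be combined with a grouped count. The following is a minimal sketch, assuming the eight-row frame from the question; the variable name `counts` is illustrative and not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": [1, 2, 3, 3, 3, 3, 6, 8],
    "Unit_ID": [1, 2, 1, 2, 3, 4, 1, 3],
    "Price": [50, 40, 10000, 10000, 10000, 10000, 10000, 10000],
})

mask = df["Price"].eq(10000)

# Number of 10000 rows within each ID, aligned to the masked rows by index.
counts = df[mask].groupby("ID")["Price"].transform("size")

# Cast first so that writing fractional results into the column is safe.
df["Price"] = df["Price"].astype(float)
df.loc[mask, "Price"] = df.loc[mask, "Price"] / counts
print(df)
```

Rows whose Price is not 10000 are never touched here, which differs from the groupby approaches that divide every row by its group size.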
To divide every value by the number of times it occurs in the column:
df['Price'] /= df.groupby(by="Price")['Price'].transform('size')
EDIT: to count only rows that share both ID and Price, as the question asks, group by both columns:
df['Price'] /= df.groupby(by=["ID", "Price"])['Price'].transform('size')
print(df)
   ID  Unit_ID    Price
0   1        1     50.0
1   2        2     40.0
2   3        1   2500.0
3   3        2   2500.0
4   3        3   2500.0
5   3        4   2500.0
6   6        1  10000.0
7   8        3  10000.0
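To see why the grouping keys matter, the two groupby variants can be run side by side on the eight-row frame from the question. This is a sketch only; the names `by_price` and `by_id_price` are introduced here for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": [1, 2, 3, 3, 3, 3, 6, 8],
    "Unit_ID": [1, 2, 1, 2, 3, 4, 1, 3],
    "Price": [50, 40, 10000, 10000, 10000, 10000, 10000, 10000],
})

# Grouping by Price alone counts all six 10000 rows together.
by_price = df["Price"] / df.groupby("Price")["Price"].transform("size")

# Grouping by ID and Price counts the 10000 rows separately for each ID.
by_id_price = df["Price"] / df.groupby(["ID", "Price"])["Price"].transform("size")

print(pd.DataFrame({"by_price": by_price, "by_id_price": by_id_price}))
```

With Price as the only key, the rows for ID 6 and ID 8 are divided by 6 as well; with both keys they stay at 10000 because each forms a group of size 1.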
Answer 2
Score: 1
If you only want to change the rows where Price is 10000, divide 10000 by the number of such rows:
df.loc[df.Price == 10000, 'Price'] = 10000 / len(df.loc[df.Price == 10000])
If you want to divide every row by the number of times its value occurs, you can use groupby and transform:
df.Price = df.groupby(by="Price").Price.transform(lambda x: x/len(x))
   ID  Unit_ID  Price
0   1        1     50
1   2        2     40
2   3        1   2500
3   3        2   2500
4   3        3   2500
5   3        4   2500
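The lambda form and a division by `transform("size")` should give the same numbers. The sketch below checks that on the six-row sample used in this answer's output; the names `with_lambda` and `with_size` are illustrative. The string form is usually preferable on large frames because it avoids calling a Python function once per group, though that is a general expectation rather than something measured here.

```python
import numpy as np
import pandas as pd

# Six-row sample matching the output shown in this answer.
df = pd.DataFrame({
    "ID": [1, 2, 3, 3, 3, 3],
    "Unit_ID": [1, 2, 1, 2, 3, 4],
    "Price": [50, 40, 10000, 10000, 10000, 10000],
})

# Per-group Python function: each group's values divided by the group's length.
with_lambda = df.groupby("Price")["Price"].transform(lambda x: x / len(x))

# Built-in group size broadcast back to the rows, then one vectorised division.
with_size = df["Price"] / df.groupby("Price")["Price"].transform("size")

print(np.allclose(with_lambda, with_size))
```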