Replace a value with that value divided by the number of times it occurs in pandas

Question


I have a dataframe as follows:

ID    Unit_ID       Price
1     1             50
2     2             40
3     1             10000
3     2             10000
3     3             10000
3     4             10000
6     1             10000
8     3             10000

In the dataframe above, I want to replace each Price of 10000 with 10000 divided by the number of rows that have the same ID and Price = 10000. For ID 3 that count is 4, so each of those rows should become 2500.

Expected Output:

ID    Unit_ID       Price
1     1             50
2     2             40
3     1             2500
3     2             2500
3     3             2500
3     4             2500
6     1             10000
8     3             10000
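
To make the example reproducible, the dataframe can be built as in the sketch below. The values are copied from the table above; the name `group_size` is only an illustrative choice. It also shows the per-row count of rows sharing the same ID and Price, which is the divisor the expected output implies.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "ID":      [1, 2, 3, 3, 3, 3, 6, 8],
    "Unit_ID": [1, 2, 1, 2, 3, 4, 1, 3],
    "Price":   [50, 40, 10000, 10000, 10000, 10000, 10000, 10000],
})

# Number of rows in each (ID, Price) group, aligned to the original rows
group_size = df.groupby(["ID", "Price"])["Price"].transform("size")
print(group_size.tolist())  # [1, 1, 4, 4, 4, 4, 1, 1]
```

Only the four rows with ID 3 share a group of size 4, so they are the only rows whose Price changes.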

Answer 1

Score: 1


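Create a boolean mask and divide the filtered rows by the number of True values, obtained with sum. Note that this counts every row with Price = 10000 regardless of ID; the output shown appears to be for a dataframe holding only the first six rows: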

mask = df.Price == 10000

df.loc[mask, 'Price'] /= mask.sum()
# same as
# df.loc[mask, 'Price'] = df.loc[mask, 'Price'] / mask.sum()
print(df)
   ID  Unit_ID   Price
0   1        1    50.0
1   2        2    40.0
2   3        1  2500.0
3   3        2  2500.0
4   3        3  2500.0
5   3        4  2500.0
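
As a hedged illustration of why the mask alone is not enough for the full eight-row dataframe from the question: there the mask matches six rows, not four, so every matching row is divided by 6. The explicit cast to float is an assumption added here so that fractional results can be stored without dtype warnings on recent pandas versions.

```python
import pandas as pd

# Full eight-row sample from the question
df = pd.DataFrame({
    "ID":      [1, 2, 3, 3, 3, 3, 6, 8],
    "Unit_ID": [1, 2, 1, 2, 3, 4, 1, 3],
    "Price":   [50, 40, 10000, 10000, 10000, 10000, 10000, 10000],
})

# Cast first so fractional results fit the column dtype
df["Price"] = df["Price"].astype(float)

mask = df["Price"] == 10000
print(mask.sum())  # 6: IDs 6 and 8 are counted as well

# Every matching row is divided by the global count, ignoring ID
df.loc[mask, "Price"] /= mask.sum()
print(df)
```

With this data the rows for IDs 6 and 8 are changed too, which differs from the expected output in the question.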

If you want to divide all values by their counts:

df['Price'] /= df.groupby(by="Price")['Price'].transform('size')

EDIT: To count rows per ID and Price, as the question asks, group by both columns:

df['Price'] /= df.groupby(by=["ID", "Price"])['Price'].transform('size')
print(df)
   ID  Unit_ID    Price
0   1        1     50.0
1   2        2     40.0
2   3        1   2500.0
3   3        2   2500.0
4   3        3   2500.0
5   3        4   2500.0
6   6        1  10000.0
7   8        3  10000.0
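
The grouped version above divides every Price by the size of its (ID, Price) group, so any other repeated (ID, Price) pair would be divided as well. If only the rows with Price = 10000 should change, one possible variant (a sketch, not part of the original answer; `target` and `per_id_count` are illustrative names) combines the mask with a per-ID count:

```python
import pandas as pd

df = pd.DataFrame({
    "ID":      [1, 2, 3, 3, 3, 3, 6, 8],
    "Unit_ID": [1, 2, 1, 2, 3, 4, 1, 3],
    "Price":   [50.0, 40.0, 10000.0, 10000.0, 10000.0, 10000.0, 10000.0, 10000.0],
})

target = 10000  # value taken from the question
mask = df["Price"] == target

# For each row, how many rows with the same ID have Price == target
per_id_count = mask.astype(int).groupby(df["ID"]).transform("sum")

# Update only the matching rows; other prices are left untouched
df.loc[mask, "Price"] = target / per_id_count[mask]
print(df)
```

On the sample data this gives 2500 for the four ID 3 rows and leaves IDs 6 and 8 at 10000.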

Answer 2

Score: 1


If you just want to replace the rows where Price is 10000, you can do:

df.loc[df.Price == 10000, 'Price'] = 10000 / len(df.loc[df.Price == 10000])

If you want to divide every row by the count of its value, you can use groupby and transform:

df.Price = df.groupby(by="Price").Price.transform(lambda x: x / len(x))

   ID  Unit_ID  Price
0   1        1     50
1   2        2     40
2   3        1   2500
3   3        2   2500
4   3        3   2500
5   3        4   2500
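
As an aside, the same per-value division can also be written without groupby, by mapping each Price to its frequency with `value_counts`. This is a different technique from the one in the answer, shown here only as a sketch on the six-row data from the output above; `counts` and `result` are illustrative names.

```python
import pandas as pd

# Six-row data matching the output shown in the answer
df = pd.DataFrame({
    "ID":      [1, 2, 3, 3, 3, 3],
    "Unit_ID": [1, 2, 1, 2, 3, 4],
    "Price":   [50, 40, 10000, 10000, 10000, 10000],
})

# Look up, for every row, how often its Price occurs in the column
counts = df["Price"].map(df["Price"].value_counts())

# Division produces a float Series aligned with df
result = df["Price"] / counts
print(result.tolist())  # [50.0, 40.0, 2500.0, 2500.0, 2500.0, 2500.0]
```

Like grouping by Price alone, this ignores ID, so it matches the question's expected output only when the repeated value occurs under a single ID.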
