英文:
How to check in which bin an entry of a pandas data frame group belongs?
问题
我明白你想要的是如何创建一个quantile_bin
列,该列包含整数值从1到10,用于指示特定Player
的wins
列中的数字属于哪个分位数。其中1代表在0到0.1分位数之间,2代表在0.1到0.2分位数之间,以此类推,10代表在0.9到1.0分位数之间。
你可以通过以下方式定义magic_quantile_function
来实现这一目标:
import pandas as pd
import numpy as np
def magic_quantile_function(x):
quantiles = np.arange(0.1, 1.1, 0.1) # 定义分位数的分割点
bins = pd.qcut(x, quantiles, labels=False, duplicates='drop') + 1
return bins
example = {'Player': ['A', 'A', 'A', 'A', 'A', 'A', 'A', 'A',
'B', 'B', 'B', 'B', 'B', 'B', 'B', 'B',
'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C'],
'wins': [2, 4, 4, 5, 6, 9, 13, 13, 15, 15, 14, 13,
11, 9, 9, 8, 8, 16, 19, 21, 14, 20, 19, 18]}
df = pd.DataFrame(example)
df['quantile_bin'] = df.groupby("Player")["wins"].transform(magic_quantile_function)
print(df)
这段代码首先定义了magic_quantile_function
,它使用pd.qcut
函数来计算每个Player
的wins
列所属的分位数,并将结果存储在quantile_bin
列中。然后,使用df.groupby("Player")["wins"].transform
将这个函数应用到每个分组中,最终得到你想要的结果。
输出的结果应该类似于你所期望的那样。
英文:
I have a following problem.
I have a pandas data frame, eg something like this:
example = {'Player': ['A', 'A', 'A', 'A', 'A', 'A', 'A', 'A',
'B', 'B', 'B', 'B', 'B', 'B', 'B', 'B',
'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C'],
'wins': [2, 4, 4, 5, 6, 9, 13, 13, 15, 15, 14, 13,
11, 9, 9, 8, 8, 16, 19, 21, 14, 20, 19, 18]}
df = pd.DataFrame(example)
What I want to build, is a column quantile_bin
, which has integer values from 1 to 10, which tells me, to which quantile the specific number in wins
for a specific Player
belongs, where 1 is between 0 and 0.1 quantile, 2 is between 0.1 and 0.2 quantile, ..., and 10 is between 0.9 and 1.0 quantile.
In other words
df.groupby("Player")["wins"].apply(magic_quantile_function)
The result should be something like
>>> df
Player wins quantile_bin
0 A 2 1
1 A 4 2
2 A 4 2
3 A 5 5
4 A 6 6
5 A 9 8
6 A 13 10
7 A 13 10
8 B 15 10
9 B 15 10
10 B 14 9
11 B 13 6
12 B 11 5
13 B 9 2
14 B 9 2
15 B 8 1
16 C 8 1
17 C 16 3
18 C 19 7
19 C 21 10
20 C 14 2
21 C 20 9
22 C 19 7
23 C 18 5
I would be grateful if one could help me defining this magic_quantile_function
, or at least giving me a hint on how to pass a grouped wins
distribution to a user defined function which is applied to a groupby object.
Thank you in advance!
答案1
得分: 1
看起来你可以使用qcut
df['quantile'] = df.groupby('Player')['wins'].transform(lambda s: pd.qcut(s, q=10, duplicates='drop', labels=range(2, 11)))
Player wins quantile
0 A 2 2
1 A 4 3
2 A 4 3
3 A 5 6
4 A 6 7
5 A 9 9
6 A 13 10
7 A 13 10
8 B 15 10
9 B 15 10
10 B 14 9
11 B 13 7
12 B 11 6
13 B 9 3
14 B 9 3
15 B 8 2
16 C 8 2
17 C 16 4
18 C 19 7
19 C 21 10
20 C 14 3
21 C 20 9
22 C 19 7
23 C 18 6
英文:
Seems like you could use qcut
df['quantile'] = df.groupby('Player')['wins'].transform(lambda s: pd.qcut(s,
q=10,
duplicates='drop',
labels=range(2, 11)))
Player wins quantile
0 A 2 2
1 A 4 3
2 A 4 3
3 A 5 6
4 A 6 7
5 A 9 9
6 A 13 10
7 A 13 10
8 B 15 10
9 B 15 10
10 B 14 9
11 B 13 7
12 B 11 6
13 B 9 3
14 B 9 3
15 B 8 2
16 C 8 2
17 C 16 4
18 C 19 7
19 C 21 10
20 C 14 3
21 C 20 9
22 C 19 7
23 C 18 6
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论