Get ranking within Pandas.DataFrame with ties as possibility
Question
I have the following example:

import pandas as pd
names = ['a', 'b', 'c', 'd', 'e']
points = [10, 15, 15, 12, 20]
scores = pd.DataFrame({'name': names,
                       'points': points})

I want to create a new column called position that specifies the relative position of each player. The player with the most points is #1.

I sort the df using

scores = scores.sort_values(by='points', ascending=False)

If there is a tie (same number of points), I want position to be T followed by the corresponding position. In my example the position of b and c is T2.

Desired output:

name  points  position
e     20      1
b     15      T2
c     15      T2
d     12      3
a     10      4

Thank you
Answer 1
Score: 1
I would use pandas.Series.rank:

# is there a tie? (True for every row whose points value occurs more than once)
m = scores["points"].duplicated(keep=False)
# calculate dense rankings (floats: 1.0, 2.0, ...), highest points first
s = scores["points"].rank(method="dense", ascending=False)
scores["position"] = (
    s.where(~m, s.astype(str).radd("T"))  # prefix tied ranks with "T"
    .astype(str)
    .replace(r"\.0$", "", regex=True)  # strip the trailing ".0"; the dot is escaped so it matches literally
)
out = scores.sort_values(by="points", ascending=False)

Output:

print(out)
  name  points position
4    e      20        1
1    b      15       T2
2    c      15       T2
3    d      12        3
0    a      10        4
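As a variation on the approach above, the regex step can probably be avoided by casting the dense ranks to integers before building the labels. The following is a minimal sketch, not the answerer's code; it assumes the points column contains no missing values (otherwise astype(int) would raise), and the names rank, tied and labels are chosen here for illustration only.

```python
import pandas as pd

scores = pd.DataFrame({
    "name": ["a", "b", "c", "d", "e"],
    "points": [10, 15, 15, 12, 20],
})

# Dense ranks as integers, so there is no ".0" suffix to strip later.
# Assumption: "points" has no NaN values.
rank = scores["points"].rank(method="dense", ascending=False).astype(int)

# True for every row whose points value occurs more than once.
tied = scores["points"].duplicated(keep=False)

# Keep plain digit strings for unique scores, prefix tied ones with "T".
labels = rank.astype(str)
scores["position"] = labels.where(~tied, "T" + labels)

out = scores.sort_values(by="points", ascending=False)
print(out)
```

Because the ranks are integers before they become strings, the labels come out as "1", "T2", "3" and so on without any post-processing.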