Get ranking within Pandas.DataFrame with ties as possibility

Question

I have the following example:

import pandas as pd

names = ['a', 'b', 'c', 'd', 'e']
points = [10, 15, 15, 12, 20]

scores = pd.DataFrame({'name': names,
                       'points': points})

I want to create a new column called position that specifies the relative position of a player. The player with the most points is #1.

I sort the DataFrame using:

scores = scores.sort_values(by='points', ascending=False)

If there is a tie (same number of points), I want position to be a T followed by the corresponding position.
In my example, the position of both b and c is T2.

Desired output:

  name  points  position
     e      20         1
     b      15        T2
     c      15        T2
     d      12         3
     a      10         4

Thank you

Answer 1

Score: 1


I would use pandas.Series.rank:

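# is there a tie? (True for every row whose points value occurs more than once)
m = scores["points"].duplicated(keep=False)

# calculate dense rankings, highest points first (float values: 1.0, 2.0, ...)
s = scores["points"].rank(method="dense", ascending=False)

# prefix tied ranks with "T", then strip the trailing ".0" left by the float ranks
scores["position"] = (
    s.where(~m, s.astype(str).radd("T"))
     .astype(str)
     .replace(r"\.0$", "", regex=True)
)

out = scores.sort_values(by="points", ascending=False)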

# Output:

print(out)

  name  points position
4    e      20        1
1    b      15       T2
2    c      15       T2
3    d      12        3
0    a      10        4
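
As a variation on the same idea, the ranks can be cast to integers before they become strings, which avoids the regex cleanup of the trailing ".0". The following is only a sketch: `add_position` is a hypothetical helper name (not part of pandas), and it assumes the ranked column contains no missing values, since NaN ranks cannot be cast to int.

```python
import pandas as pd


def add_position(df, col="points", method="dense"):
    """Return a sorted copy of df with a 'position' column; ties get a 'T' prefix.

    Hypothetical helper for illustration; assumes df[col] has no NaN values.
    """
    # Integer ranks as strings, highest value first.
    rank = df[col].rank(method=method, ascending=False).astype(int).astype(str)
    # True for every row whose value occurs more than once.
    tied = df[col].duplicated(keep=False)
    # Replace the rank with its "T"-prefixed form only where there is a tie.
    out = df.assign(position=rank.mask(tied, "T" + rank))
    return out.sort_values(by=col, ascending=False)


scores = pd.DataFrame({"name": ["a", "b", "c", "d", "e"],
                       "points": [10, 15, 15, 12, 20]})
out = add_position(scores)
print(out)
```

With method="dense" the player after the tie (d) gets position 3, as in the desired output. Passing method="min" instead would use competition-style ranking, where d gets position 4 and a gets position 5 because the two tied players occupy places 2 and 3.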

huangapple
  • Published on January 8, 2023 at 22:01:01
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75048344.html