Get ranking within a pandas.DataFrame, with ties as a possibility

Question


I have the following example:

    import pandas as pd
    names = ['a', 'b', 'c', 'd', 'e']
    points = [10, 15, 15, 12, 20]
    scores = pd.DataFrame({'name': names,
                           'points': points})

I want to create a new column called position that gives each player's relative position. The player with the most points is #1.

I sort the DataFrame using

    scores = scores.sort_values(by='points', ascending=False)

If there is a tie (the same number of points), I want position to be "T" followed by the corresponding position. In my example, the position of b and c is T2.

Desired output:

    name  points  position
    e     20      1
    b     15      T2
    c     15      T2
    d     12      3
    a     10      4

Thank you

Answer 1

Score: 1


I would use pandas.Series.rank:

    # Mark every row whose score occurs more than once (a tie)
    m = scores["points"].duplicated(keep=False)
    # Dense ranking, highest score first (returns floats: 1.0, 2.0, ...)
    s = scores["points"].rank(method="dense", ascending=False)
    scores["position"] = (
        s.where(~m, s.astype(str).radd("T"))  # prefix tied ranks with "T"
        .astype(str)
        .replace(r"\.0$", "", regex=True)     # strip the ".0" left by the float rank
    )
    out = scores.sort_values(by="points", ascending=False)

Output:

    print(out)
      name  points position
    4    e      20        1
    1    b      15       T2
    2    c      15       T2
    3    d      12        3
    0    a      10        4
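
As a variation on the same idea, the steps can be wrapped in a small helper. This is only a sketch: the function name `add_position` and its parameters are made up for illustration, and it assumes the points column has no missing values (the integer cast would fail on NaN). Casting the rank to `int` before converting to `str` avoids having to strip a trailing ".0" afterwards. The `method` argument is passed straight to `Series.rank`, so it also shows how "dense" (no gaps after a tie, as in the desired output) differs from "min" (gaps after a tie).

```python
import pandas as pd


def add_position(df, points_col="points", method="dense"):
    """Return a sorted copy of df with a tie-aware 'position' column.

    Hypothetical helper for illustration; assumes no NaN in points_col.
    """
    # Rank with the highest score first, then cast to int so the
    # string form is "2" rather than "2.0".
    rank = (
        df[points_col]
        .rank(method=method, ascending=False)
        .astype(int)
        .astype(str)
    )
    # True for every row whose score occurs more than once.
    tied = df[points_col].duplicated(keep=False)

    out = df.copy()  # leave the caller's DataFrame untouched
    out["position"] = rank.where(~tied, "T" + rank)
    return out.sort_values(by=points_col, ascending=False)


scores = pd.DataFrame({"name": ["a", "b", "c", "d", "e"],
                       "points": [10, 15, 15, 12, 20]})

dense = add_position(scores)                      # positions 1, T2, T2, 3, 4
competition = add_position(scores, method="min")  # positions 1, T2, T2, 4, 5
print(dense)
print(competition)
```

With `method="dense"` the player after the tie gets position 3, matching the desired output in the question; with `method="min"` that player gets position 4, which is the convention many leaderboards use.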

huangapple
  • Published on January 8, 2023 at 22:01:01
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75048344.html