January 8, 2023, 22:01:01

Get ranking within Pandas.DataFrame with ties as possibility

Question

I have the following example:

import pandas as pd
names = ['a', 'b', 'c', 'd', 'e']
points = [10, 15, 15, 12, 20]
scores = pd.DataFrame({'name': names,
                       'points': points})

I want to create a new column called position that specifies each player's relative position. The player with the most points is #1.

I sort the DataFrame using:

scores = scores.sort_values(by='points', ascending=False)

If there is a tie (the same number of points), I want position to be a T followed by the corresponding position.
In my example, the position of b and c is T2.

Desired output:

  name  points  position
     e      20         1
     b      15        T2
     c      15        T2
     d      12         3
     a      10         4

Thank you

Answer 1

Score: 1

I would use pandas.Series.rank:

# Is there a tie? True for every row whose points value occurs more than once.
m = scores["points"].duplicated(keep=False)
# Calculate dense rankings (tied rows share a rank, and no rank is skipped).
s = scores["points"].rank(method="dense", ascending=False)
scores["position"] = (
    s.where(~m, s.astype(str).radd("T"))   # prefix tied ranks with "T"
     .astype(str)
     .replace(r"\.0$", "", regex=True)     # drop the trailing ".0" left by the float ranks
)
out = scores.sort_values(by="points", ascending=False)

# Output:

print(out)
  name  points position
4    e      20        1
1    b      15       T2
2    c      15       T2
3    d      12        3
0    a      10        4
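
The same idea can be wrapped in a small reusable function. This is only a sketch: the helper name add_position and its col parameter are made up for illustration and are not part of pandas, and it assumes the points column has no missing values (otherwise the cast to int would fail). Casting the ranks to int before turning them into strings avoids having to strip the ".0" suffix afterwards.

```python
import pandas as pd


def add_position(df: pd.DataFrame, col: str = "points") -> pd.DataFrame:
    """Return a sorted copy of df with a 'position' column; ties get a 'T' prefix.

    Hypothetical helper for illustration; assumes `col` contains no NaN values.
    """
    out = df.copy()
    # Dense ranking: tied rows share a rank and the following rank is not skipped.
    rank = out[col].rank(method="dense", ascending=False).astype(int).astype(str)
    # True for every row whose value in `col` occurs more than once.
    tied = out[col].duplicated(keep=False)
    # Keep the plain rank for unique values, use "T" + rank for tied ones.
    out["position"] = rank.where(~tied, "T" + rank)
    # A stable sort keeps tied rows in their original relative order.
    return out.sort_values(by=col, ascending=False, kind="stable")


scores = pd.DataFrame({"name": ["a", "b", "c", "d", "e"],
                       "points": [10, 15, 15, 12, 20]})
result = add_position(scores)
print(result)
```

If competition-style numbering is preferred (1, T2, T2, 4, 5 instead of 1, T2, T2, 3, 4), changing method="dense" to method="min" should give that, since "min" assigns each tied group its lowest rank and skips the ranks the group occupies.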
