May 24, 2023 22:28:27

How to sort a list of dictionaries by a list that can contain duplicate values?

Question

Context:
In Python 3.9 sorting a list of most objects by a second list is easy, even if duplicates are present:

>>> sorted(zip([5, 5, 3, 2, 1], ['z', 'y', 'x', 'w', 'x']))
[(1, 'x'), (2, 'w'), (3, 'x'), (5, 'y'), (5, 'z')]

If the list to be sorted contains dictionaries, sorting generally still works:

>>> sorted(zip([3, 2, 1], [{'z': 1}, {'y': 2}, {'x': 3}]))
[(1, {'x': 3}), (2, {'y': 2}), (3, {'z': 1})]

Issue:
However, when the list to be sorted by contains duplicates, the following error occurs:

>>> sorted(zip([5, 5, 3, 2, 1], [{'z': 1}, {'y': 2}, {'x': 3}, {'w': 4}, {'u': 5}]))
*** TypeError: '<' not supported between instances of 'dict' and 'dict'

The issue seems pretty crazy to me: how can the values of the list to sort by affect whether the other list can be sorted at all?

Alternative solution:
One not-so-elegant solution would be to sort the indices instead and then fetch the dictionary objects from the list by index:

>>> sort_list = [5, 5, 3, 2, 1]
>>> dicts_list = [{'z': 1}, {'y': 2}, {'x': 3}, {'w': 4}, {'u': 5}]
>>> [dicts_list[i] for _, i in sorted(zip(sort_list, range(len(sort_list))))]
[{'u': 5}, {'w': 4}, {'x': 3}, {'z': 1}, {'y': 2}]

Related Questions on StackOverflow:
Many similar questions have been raised on StackOverflow, related to

This specific case, in particular with duplicates in the list to sort by, has not been discussed yet.

Answer 1

Score: 5

The reason for this is the way tuple comparison works. Generally, if t1 and t2 are tuples, t1 < t2 if t1[0] < t2[0]; however, if t1[0] == t2[0], Python goes on to compare t1[1] < t2[1], and so on. This means you can compare tuples even if they contain items that can't be compared to each other, as long as an earlier position already decides the comparison. With the duplicate 5s, the first items tie, so the dictionaries end up being compared and the TypeError is raised.

The standard way to solve this is to provide a key function that ignores the dictionary:

sorted(zip([5, 5, 3, 2, 1], [{'z': 1}, {'y': 2}, {'x': 3}, {'w': 4}, {'u': 5}]), key=lambda item: item[0])
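
To make the comparison rule concrete, here is a minimal sketch using the data from the question. The variable names (first_items_differ, tie_raised, by_number) are only illustrative and do not come from the original post.

```python
# Tuples compare position by position; a later position is only
# consulted when all earlier positions are equal.
a = (1, {'x': 3})
b = (2, {'y': 2})
first_items_differ = a < b  # decided by 1 < 2, the dicts are never compared

try:
    (5, {'z': 1}) < (5, {'y': 2})  # 5 == 5, so Python falls through to dict < dict
    tie_raised = False
except TypeError:
    tie_raised = True

# With a key function, only the key (the number) is ever compared.
pairs = list(zip([5, 5, 3, 2, 1],
                 [{'z': 1}, {'y': 2}, {'x': 3}, {'w': 4}, {'u': 5}]))
by_number = sorted(pairs, key=lambda item: item[0])
print(by_number)
# [(1, {'u': 5}), (2, {'w': 4}), (3, {'x': 3}), (5, {'z': 1}), (5, {'y': 2})]
```

Because Python's sort is stable, the two pairs with key 5 keep their original relative order, so {'z': 1} still comes before {'y': 2}.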

Answer 2

Score: 1

In your second example:

sorted(zip([3, 2, 1], [{'z': 1}, {'y': 2}, {'x': 3}]))

The sort never has to use the dictionaries in the comparison because the first items of the tuples produced by zip are never equal.

In the third example:

sorted(zip([5, 5, 3, 2, 1], [{'z': 1}, {'y': 2}, {'x': 3}, {'w': 4}, {'u': 5}]))

The two zipped tuples that start with 5 need to use their second items (the dictionaries) in the comparison and, since dictionaries cannot be compared to determine which one is greater/smaller, the sort fails.

For this to work, you have to provide a key function (for example a lambda) that tells the sort how you want your dictionaries to be ordered. There are many possible conventions, such as comparing only the keys, the largest value or key, or something else entirely (that's up to you; sort can't know).
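
As one possible convention, the sketch below breaks ties between equal numbers by comparing each dictionary's sorted (key, value) pairs. The helper name number_then_items is hypothetical, and the approach assumes that the keys and values inside the dictionaries can be compared with each other.

```python
numbers = [5, 5, 3, 2, 1]
dicts = [{'z': 1}, {'y': 2}, {'x': 3}, {'w': 4}, {'u': 5}]

def number_then_items(pair):
    """Sort key: the number first, then the dict's sorted (key, value) pairs."""
    number, d = pair
    # A list of tuples is orderable, unlike the dict itself.
    return (number, sorted(d.items()))

ordered = sorted(zip(numbers, dicts), key=number_then_items)
print(ordered)
# [(1, {'u': 5}), (2, {'w': 4}), (3, {'x': 3}), (5, {'y': 2}), (5, {'z': 1})]
```

Here the two pairs with the number 5 are ordered by their dictionary contents, so {'y': 2} comes before {'z': 1}, unlike a sort on the number alone, which would keep their original order.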

Answer 3

Score: 0

A variant of @Jasmin's answer that uses itemgetter instead of a lambda function to return the first element of each tuple (and so ignores the dictionaries):

from operator import itemgetter
sorted(zip([5, 5, 3, 2, 1], [{'z': 1}, {'y': 2}, {'x': 3}, {'w': 4}, {'u': 5}]), key=itemgetter(0))
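
If only the sorted dictionaries are needed, as in the question's index-based workaround, the same idea can be wrapped in a small function. This is just a sketch; sort_by is a hypothetical helper name, not something from the standard library.

```python
from operator import itemgetter

def sort_by(items, sort_keys):
    """Return items ordered by the matching entries of sort_keys.

    Only the sort keys are compared, so the items themselves
    do not need to support '<'.
    """
    pairs = sorted(zip(sort_keys, items), key=itemgetter(0))
    return [item for _, item in pairs]

sort_list = [5, 5, 3, 2, 1]
dicts_list = [{'z': 1}, {'y': 2}, {'x': 3}, {'w': 4}, {'u': 5}]
result = sort_by(dicts_list, sort_list)
print(result)
# [{'u': 5}, {'w': 4}, {'x': 3}, {'z': 1}, {'y': 2}]
```

The output matches the question's index-based version: because the sort is stable, the two dictionaries that share the key 5 stay in their original order.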
