How to sort a list of dictionaries by a list that can contain duplicate values?

Question

Context:
In Python 3.9 sorting a list of most objects by a second list is easy, even if duplicates are present:

>>> sorted(zip([5, 5, 3, 2, 1], ['z', 'y', 'x', 'w', 'x']))
[(1, 'x'), (2, 'w'), (3, 'x'), (5, 'y'), (5, 'z')]

If the list to be sorted contains dictionaries, sorting generally goes fine:

>>> sorted(zip([3, 2, 1], [{'z': 1}, {'y': 2}, {'x': 3}]))
[(1, {'x': 3}), (2, {'y': 2}), (3, {'z': 1})]

Issue:
However, when the list to be sorted by contains duplicates, the following error occurs:

>>> sorted(zip([5, 5, 3, 2, 1], [{'z': 1}, {'y': 2}, {'x': 3}, {'w': 4}, {'u': 5}]))
*** TypeError: '<' not supported between instances of 'dict' and 'dict'

The issue seems pretty crazy to me: how do the values of the list to sort by even affect whether the list being sorted can be sorted?

Alternative solution:
One not-so-elegant solution would be to get the dictionary objects from the list by index:

>>> sort_list = [5, 5, 3, 2, 1]
>>> dicts_list = [{'z': 1}, {'y': 2}, {'x': 3}, {'w': 4}, {'u': 5}]
>>> [dicts_list[i] for _, i in sorted(zip(sort_list, range(len(sort_list))))]
[{'u': 5}, {'w': 4}, {'x': 3}, {'z': 1}, {'y': 2}]
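
The index-based workaround can also be written without zip, by sorting the indices directly and using the sort list as the key. This is only a sketch of the same idea; the name order is introduced here for illustration and is not part of the original question.

```python
sort_list = [5, 5, 3, 2, 1]
dicts_list = [{'z': 1}, {'y': 2}, {'x': 3}, {'w': 4}, {'u': 5}]

# Sort the positions 0..n-1 by the value found at each position in sort_list.
# Only integers are ever compared, so the dictionaries never take part.
order = sorted(range(len(sort_list)), key=sort_list.__getitem__)

# Pick the dictionaries in that order.
sorted_dicts = [dicts_list[i] for i in order]
print(sorted_dicts)
```

Because sorted is stable, the two entries that share the value 5 stay in their original relative order.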

Related Questions on StackOverflow:
Many similar questions have been raised on StackOverflow, related to

This specific case, especially with duplicates included, has not been discussed yet.

Answer 1

Score: 5

The reason for this is the way tuple comparison works. If t1 and t2 are tuples, then t1 < t2 when t1[0] < t2[0]; however, if t1[0] == t2[0], Python goes on to compare t1[1] < t2[1], and so on. This means you can compare tuples even if they contain items that can't be compared to each other, as long as an earlier pair of items already decides the comparison.
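
A small sketch of that behaviour, using made-up tuples shaped like the pairs that zip produces in the question (the variable names are only illustrative):

```python
# Tuples compare element by element; the first pair that differs decides.
t1 = (3, {'z': 1})
t2 = (5, {'y': 2})
first_differs = t1 < t2  # 3 < 5 settles it; the dicts are never ordered

# When the first items tie, the comparison falls through to the dicts.
t3 = (5, {'z': 1})
t4 = (5, {'y': 2})
try:
    t3 < t4
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(first_differs)
print(error_message)
```

The first comparison succeeds, while the second raises the same TypeError as in the question.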

The standard way to solve this is to provide a key function that ignores the dictionary:

sorted(zip([5, 5, 3, 2, 1], [{'z': 1}, {'y': 2}, {'x': 3}, {'w': 4}, {'u': 5}]), key=lambda item: item[0])
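
If only the reordered dictionaries are needed, the pairs can be unpacked again after sorting. A minimal sketch, reusing the sort_list and dicts_list names from the question:

```python
sort_list = [5, 5, 3, 2, 1]
dicts_list = [{'z': 1}, {'y': 2}, {'x': 3}, {'w': 4}, {'u': 5}]

# The key function looks only at the number, so dicts are never compared.
pairs = sorted(zip(sort_list, dicts_list), key=lambda item: item[0])

# Drop the numbers and keep the dictionaries in their new order.
sorted_dicts = [d for _, d in pairs]
print(sorted_dicts)
```

Since Python's sort is stable, pairs with equal keys keep their original relative order, so {'z': 1} still comes before {'y': 2}.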

Answer 2

Score: 1

In your second example:

sorted(zip([3, 2, 1], [{'z': 1}, {'y': 2}, {'x': 3}]))

The sort never has to use the dictionaries in the comparison because the first items of the tuples produced by zip are never equal.

In the third example:

sorted(zip([5, 5, 3, 2, 1], [{'z': 1}, {'y': 2}, {'x': 3}, {'w': 4}, {'u': 5}]))

The two zipped tuples that start with 5 need to use their second items (the dictionaries) in the comparison and, since dictionaries cannot be compared to determine which one is greater/smaller, the sort fails.

For this to work, you have to provide a key function (for example, a lambda) that tells the sort how you want your dictionaries to be ordered. There are many possibilities, such as comparing only the keys, the largest value or key, or some other convention; that is up to you, since sort cannot know.
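
As one possible convention (an assumption made here for illustration, not something the question prescribes), ties could be broken by each dictionary's sorted (key, value) pairs. This sketch assumes all keys are strings and all values are mutually comparable:

```python
sort_list = [5, 5, 3, 2, 1]
dicts_list = [{'z': 1}, {'y': 2}, {'x': 3}, {'w': 4}, {'u': 5}]

def pair_key(item):
    """Order by the number first, then by the dict's sorted items."""
    number, d = item
    # sorted(d.items()) is a list of (key, value) tuples, which is orderable
    # as long as the keys and values themselves are comparable.
    return (number, sorted(d.items()))

pairs = sorted(zip(sort_list, dicts_list), key=pair_key)
sorted_dicts = [d for _, d in pairs]
print(sorted_dicts)
```

With this convention the tie between the two 5s is resolved by key name, so {'y': 2} now comes before {'z': 1}.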

Answer 3

Score: 0

A variant of @Jasmin's answer, using itemgetter instead of a lambda function to return the first element of each tuple (and so ignoring the dictionaries):

from operator import itemgetter
sorted(zip([5, 5, 3, 2, 1], [{'z': 1}, {'y': 2}, {'x': 3}, {'w': 4}, {'u': 5}]), key=itemgetter(0))

huangapple
  • Published on May 24, 2023, 22:28:27
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76324628.html