2013-10-01 22:33:26

Sort list of dictionaries based on values with highest similarity

Question

Given the following Python list of dictionaries:

results = [[{'id': '001', 'result': [0,0,0,0,1]},
            {'id': '002', 'result': [1,1,1,1,1]},
            {'id': '003', 'result': [0,1,1,None,None]},
            {'id': '004', 'result': [0,None,None,1,0]},
            {'id': '005', 'result': [1,0,None,1,1]},
            {'id': '006', 'result': [0,0,0,1,1]}],
           [{'id': '001', 'result': [1,0,1,0,1]},
            {'id': '002', 'result': [1,1,1,1,1]},
            {'id': '003', 'result': [0,1,1,None,None]},
            {'id': '004', 'result': [0,None,None,1,0]},
            {'id': '005', 'result': [1,0,None,1,1]},
            {'id': '006', 'result': [1,0,1,0,1]}]
          ]

I would like to generate a new sorted list (in both Python and Go) based on the values of 'result', by comparing results between the players ('id') in each group and then sorting the players by the number of matching entries (None results are discarded and not counted):

Across the first and second rounds, 001 and 006 had nine matching answers in total:
During the first round, 001 = [0,0,0,0,1] and 006 = [0,0,0,1,1] - four matching answers.
During the second round, 001 = [1,0,1,0,1] and 006 = [1,0,1,0,1] - five matching answers.

sorted_results = ['001','006','002','005','003','004']

'001' and '006' are the first two items in the list because they have the highest number of matching results - nine.
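
The counting rule described above can be sketched in Python as follows. This is only a minimal illustration of the rule, not a full solution: the helper name count_matches is hypothetical, and the sketch assumes that both answer lists have the same length and that a position counts only when both players have a non-None answer there.

```python
def count_matches(a, b):
    """Count positions where both answers are present and equal."""
    return sum(1 for x, y in zip(a, b) if x is not None and x == y)


# Answers of players 001 and 006, copied from the two rounds above.
round1 = count_matches([0, 0, 0, 0, 1], [0, 0, 0, 1, 1])
round2 = count_matches([1, 0, 1, 0, 1], [1, 0, 1, 0, 1])
total = round1 + round2

print(round1, round2, total)  # 4 5 9
```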

Answer 1

Score: 1

If you sort those items by the "highest number of identical results", this is what you get:

['003', '004', '005', '006', '001', '002']

If you meant something else (i.e. not "highest number of identical results"), please clarify your question. Alternatively, you can modify the max_identical function so that it follows your own definition of "similar".

The above result was computed with:

from collections import Counter


# Only the first round from the question is used here.
results = [{'id': '001', 'result': [0, 0, 0, 0, 1]},
           {'id': '002', 'result': [1, 1, 1, 1, 1]},
           {'id': '003', 'result': [0, 1, 1, None, None]},
           {'id': '004', 'result': [0, None, None, 1, 0]},
           {'id': '005', 'result': [1, 0, None, 1, 1]},
           {'id': '006', 'result': [0, 0, 0, 1, 1]}]


def max_identical(lst):
    # Count how often each non-None value occurs in the list.
    counts = Counter(x for x in lst if x is not None)
    # default=0 covers a list that contains only None values.
    return max(counts.values(), default=0)


# Sort ascending by the size of each player's largest group of identical answers.
results = sorted(results, key=lambda x: max_identical(x['result']))

print([x['id'] for x in results])
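
One way to adapt this approach to pairwise similarity is to replace max_identical with a key that scores each player by their best match against any other player. The following is a sketch under stated assumptions, not the answer's own method: count_matches and best_match are hypothetical helper names, None answers are skipped, and only the first round from the question is used.

```python
def count_matches(a, b):
    """Count positions where both answers are present and equal."""
    return sum(1 for x, y in zip(a, b) if x is not None and x == y)


def best_match(player, players):
    """Highest match count between `player` and any other player."""
    return max(
        (count_matches(player['result'], other['result'])
         for other in players if other['id'] != player['id']),
        default=0,
    )


results = [{'id': '001', 'result': [0, 0, 0, 0, 1]},
           {'id': '002', 'result': [1, 1, 1, 1, 1]},
           {'id': '003', 'result': [0, 1, 1, None, None]},
           {'id': '004', 'result': [0, None, None, 1, 0]},
           {'id': '005', 'result': [1, 0, None, 1, 1]},
           {'id': '006', 'result': [0, 0, 0, 1, 1]}]

# reverse=True puts the most similar players first; sorted() is stable,
# so players with equal scores keep their original relative order.
ranked = sorted(results, key=lambda p: best_match(p, results), reverse=True)

print([p['id'] for p in ranked])  # ['001', '006', '002', '005', '003', '004']
```

Every player is compared with every other player, so the cost grows quadratically with the number of players. That is fine for small groups like this one.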

Answer 2

Score: 0

While looking for a solution to a problem very similar to yours, I found this page:
http://w3facility.org/question/sorting-a-python-dictionary-after-running-an-itertools-function/

Using your example:

import itertools

results = [[{'id': '001', 'result': [0,0,0,0,1]},
            {'id': '002', 'result': [1,1,1,1,1]},
            {'id': '003', 'result': [0,1,1,None,None]},
            {'id': '004', 'result': [0,None,None,1,0]},
            {'id': '005', 'result': [1,0,None,1,1]},
            {'id': '006', 'result': [0,0,0,1,1]}],
           [{'id': '001', 'result': [1,0,1,0,1]},
            {'id': '002', 'result': [1,1,1,1,1]},
            {'id': '003', 'result': [0,1,1,None,None]},
            {'id': '004', 'result': [0,None,None,1,0]},
            {'id': '005', 'result': [1,0,None,1,1]},
            {'id': '006', 'result': [1,0,1,0,1]}]
          ]

This creates an all-vs-all comparison of the ids, one for each round, and adds up the matches per pair.

similarity = {}
for round_results in results:
    for p1, p2 in itertools.combinations(round_results, 2):
        pair = (p1["id"], p2["id"])
        # Count positions where both answers are present (not None) and equal,
        # so that two None entries are not treated as a match.
        matches = sum(1 for a, b in zip(p1["result"], p2["result"])
                      if a is not None and a == b)
        similarity[pair] = similarity.get(pair, 0) + matches

Sorting the id pairs by their match counts returns a list of id tuples, ordered from most to least similar.

ranked_pairs = sorted(similarity, key=similarity.get, reverse=True)
print(ranked_pairs)

To eliminate the duplicate ids, keep only the first occurrence of each id, in that order, and ignore the rest.

sorted_ids = []
for id1, id2 in ranked_pairs:
    for player_id in (id1, id2):
        if player_id not in sorted_ids:
            sorted_ids.append(player_id)

print(sorted_ids)
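
The steps above can also be packaged into a single function that accepts any number of rounds. The version below is a sketch under stated assumptions rather than a definitive implementation: rank_ids is a hypothetical name, None answers are never counted as matches, and ties between pairs are broken by the id pair itself so that the output does not depend on dictionary ordering.

```python
from collections import Counter
from itertools import combinations


def rank_ids(rounds):
    """Order ids by their highest pairwise match total across all rounds."""
    totals = Counter()
    for players in rounds:
        for p1, p2 in combinations(players, 2):
            pair = tuple(sorted((p1['id'], p2['id'])))
            totals[pair] += sum(
                1 for a, b in zip(p1['result'], p2['result'])
                if a is not None and a == b
            )
    # Highest total first; ties fall back to the id pair itself.
    ranked_pairs = sorted(totals, key=lambda pair: (-totals[pair], pair))
    ordered = []
    for pair in ranked_pairs:
        for player_id in pair:
            if player_id not in ordered:
                ordered.append(player_id)
    return ordered


rounds = [[{'id': '001', 'result': [0, 0, 0, 0, 1]},
           {'id': '002', 'result': [1, 1, 1, 1, 1]},
           {'id': '003', 'result': [0, 1, 1, None, None]},
           {'id': '004', 'result': [0, None, None, 1, 0]},
           {'id': '005', 'result': [1, 0, None, 1, 1]},
           {'id': '006', 'result': [0, 0, 0, 1, 1]}],
          [{'id': '001', 'result': [1, 0, 1, 0, 1]},
           {'id': '002', 'result': [1, 1, 1, 1, 1]},
           {'id': '003', 'result': [0, 1, 1, None, None]},
           {'id': '004', 'result': [0, None, None, 1, 0]},
           {'id': '005', 'result': [1, 0, None, 1, 1]},
           {'id': '006', 'result': [1, 0, 1, 0, 1]}]]

print(rank_ids(rounds))  # ['001', '006', '002', '005', '003', '004']
```

For the data in the question this reproduces the expected sorted_results list. With other data, a different tie-breaking rule could produce a different order among equally similar pairs.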
