Sort list of dictionaries based on values with highest similarity
Question
Given the following Python list of dictionaries (one inner list per round):
results = [[{'id': '001', 'result': [0,0,0,0,1]},
{'id': '002', 'result': [1,1,1,1,1]},
{'id': '003', 'result': [0,1,1,None,None]},
{'id': '004', 'result': [0,None,None,1,0]},
{'id': '005', 'result': [1,0,None,1,1]},
{'id': '006', 'result': [0,0,0,1,1]}],
[{'id': '001', 'result': [1,0,1,0,1]},
{'id': '002', 'result': [1,1,1,1,1]},
{'id': '003', 'result': [0,1,1,None,None]},
{'id': '004', 'result': [0,None,None,1,0]},
{'id': '005', 'result': [1,0,None,1,1]},
{'id': '006', 'result': [1,0,1,0,1]}]
]
I would like to generate a new sorted list (in both Python and Golang) based on the values of 'result': compare the results of every pair of players ('id') within each round, then sort the players by their number of matching entries (None results are discarded and not counted).
For example, across the first and second rounds, 001 and 006 had nine matching answers in total.
During the first round, 001 and 006 had four matching answers:
001 = [0,0,0,0,1] 006 = [0,0,0,1,1]
During the second round, 001 and 006 had five matching answers:
001 = [1,0,1,0,1] 006 = [1,0,1,0,1]
The expected output is:
sorted_results = ['001','006','002','005','003','004']
'001' and '006' are the first two items in the list because they have the highest number of matching results: nine.
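To make the counting rule concrete, here is a minimal sketch of how a single pair could be compared. The helper name count_matches is hypothetical (it does not appear in the question), and it assumes that a position counts only when both players gave the same non-None answer.

```python
def count_matches(a, b):
    """Count positions where both lists hold the same non-None value."""
    return sum(1 for x, y in zip(a, b) if x is not None and x == y)

# Players 001 and 006, taken from the data above.
round_1 = count_matches([0, 0, 0, 0, 1], [0, 0, 0, 1, 1])
round_2 = count_matches([1, 0, 1, 0, 1], [1, 0, 1, 0, 1])
print(round_1, round_2, round_1 + round_2)  # 4 5 9
```

Under this assumption, two None values in the same position are not treated as a match.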
Answer 1
Score: 1
If you sort those items by the "highest number of identical results" (that is, by how often the most common non-None value occurs within each player's own list, in ascending order), this is what you get for the first round:
['003', '004', '005', '006', '001', '002']
If you meant something else (i.e. not "highest number of identical results"), please clarify your question. Alternatively, you can modify the max_identical function so that it follows your own definition of "similar".
The above result was computed with:
from collections import defaultdict

# First-round data only (a flat list of players).
results = [{'id': '001', 'result': [0, 0, 0, 0, 1]},
           {'id': '002', 'result': [1, 1, 1, 1, 1]},
           {'id': '003', 'result': [0, 1, 1, None, None]},
           {'id': '004', 'result': [0, None, None, 1, 0]},
           {'id': '005', 'result': [1, 0, None, 1, 1]},
           {'id': '006', 'result': [0, 0, 0, 1, 1]}]

def max_identical(lst):
    # Count how often each non-None value occurs in the list.
    counts = defaultdict(int)
    for x in lst:
        if x is not None:
            counts[x] += 1
    # default=0 covers a list that contains only None values.
    return max(counts.values(), default=0)

# Sort ascending by the size of each player's largest group of identical answers.
results = sorted(results, key=lambda x: max_identical(x['result']))
print([x['id'] for x in results])
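As one possible way to adapt this to the pairwise definition in the question, the sort key can be each player's best match count against any other player, instead of max_identical. This is only a sketch for a single round: the names round_one, count_matches and best_match_scores are hypothetical, and it assumes None entries never count as a match.

```python
from itertools import combinations

round_one = [{'id': '001', 'result': [0, 0, 0, 0, 1]},
             {'id': '002', 'result': [1, 1, 1, 1, 1]},
             {'id': '003', 'result': [0, 1, 1, None, None]},
             {'id': '004', 'result': [0, None, None, 1, 0]},
             {'id': '005', 'result': [1, 0, None, 1, 1]},
             {'id': '006', 'result': [0, 0, 0, 1, 1]}]

def count_matches(a, b):
    # Positions where both players gave the same non-None answer.
    return sum(1 for x, y in zip(a, b) if x is not None and x == y)

def best_match_scores(players):
    # For every player, remember the highest match count against any other player.
    best = {p['id']: 0 for p in players}
    for p1, p2 in combinations(players, 2):
        n = count_matches(p1['result'], p2['result'])
        best[p1['id']] = max(best[p1['id']], n)
        best[p2['id']] = max(best[p2['id']], n)
    return best

scores = best_match_scores(round_one)
# sorted() is stable, so players with equal scores keep their input order.
ranked = sorted(round_one, key=lambda p: scores[p['id']], reverse=True)
ranked_ids = [p['id'] for p in ranked]
print(ranked_ids)  # ['001', '006', '002', '005', '003', '004']
```

For this data the single-round ranking happens to agree with the order requested in the question; with several rounds, the per-pair counts would need to be summed before taking the best score.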
Answer 2
Score: 0
While looking for a solution to a problem very similar to yours, I found this page:
http://w3facility.org/question/sorting-a-python-dictionary-after-running-an-itertools-function/
Using your example:
import itertools
results = [[{'id': '001', 'result': [0,0,0,0,1]},
{'id': '002', 'result': [1,1,1,1,1]},
{'id': '003', 'result': [0,1,1,None,None]},
{'id': '004', 'result': [0,None,None,1,0]},
{'id': '005', 'result': [1,0,None,1,1]},
{'id': '006', 'result': [0,0,0,1,1]}],
[{'id': '001', 'result': [1,0,1,0,1]},
{'id': '002', 'result': [1,1,1,1,1]},
{'id': '003', 'result': [0,1,1,None,None]},
{'id': '004', 'result': [0,None,None,1,0]},
{'id': '005', 'result': [1,0,None,1,1]},
{'id': '006', 'result': [1,0,1,0,1]}]
]
This builds an all-vs-all comparison of the ids and accumulates the match counts over every round.
similarity = {}
for round_results in results:
    for p1, p2 in itertools.combinations(round_results, 2):
        pair = (p1["id"], p2["id"])
        # Count positions where both players gave the same answer;
        # None entries are skipped, as the question requires.
        matches = sum(1 for a, b in zip(p1["result"], p2["result"])
                      if a is not None and a == b)
        similarity[pair] = similarity.get(pair, 0) + matches
Next, sort the id pairs by their match counts in descending order. This returns a list of id tuples, best-matching pair first.
similarity = sorted(similarity, key=lambda x:similarity[x], reverse=True)
print(similarity)
To eliminate the duplicates, keep only the first occurrence of each id, in that order, and ignore the rest.
sorted_ids = []
for tuple_id in similarity:
    if tuple_id[0] not in sorted_ids:
        sorted_ids.append(tuple_id[0])
    if tuple_id[1] not in sorted_ids:
        sorted_ids.append(tuple_id[1])
print(sorted_ids)
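The de-duplication step can also be written more compactly. The sketch below assumes Python 3.7 or later, where dicts preserve insertion order; ranked_pairs is a hypothetical hand-written sample standing in for the sorted list of id pairs, not the full output of the code above.

```python
from itertools import chain

# Hypothetical sample: id pairs already sorted by match count, best first.
ranked_pairs = [("001", "006"), ("002", "005"), ("005", "006"),
                ("001", "005"), ("002", "003"), ("002", "004")]

# chain.from_iterable flattens the pairs; dict.fromkeys drops repeats while
# keeping the first occurrence of every id.
sorted_ids = list(dict.fromkeys(chain.from_iterable(ranked_pairs)))
print(sorted_ids)  # ['001', '006', '002', '005', '003', '004']
```

This avoids the repeated "not in" scans over a list, which only matters when there are many ids.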