Count similar values from a list of dictionaries

Question

I have a list of dictionaries and need to count how many times each unique entry occurs.
I then need to sort the results by the number in the tuple stored under the key "corrected_word" (2 < 3 < 33).

mylist = [
    {'original_word': 'test1', 'corrected_word': ('test12', 3)},
    {'original_word': 'test1', 'corrected_word': ('test12', 3)},
    {'original_word': 'test2', 'corrected_word': ('test22', 2)},
    {'original_word': 'test3', 'corrected_word': ('test3', 33)},
    {'original_word': 'test3', 'corrected_word': ('test3', 33)},
    {'original_word': 'test3', 'corrected_word': ('test3', 33)}
]

Expected Output:

mylist = [
    {'original_word': 'test2', 'corrected_word': ('test22', 2, 1)},
    {'original_word': 'test1', 'corrected_word': ('test12', 3, 2)},
    {'original_word': 'test3', 'corrected_word': ('test3', 33, 3)}
]

I have tried this:

from collections import Counter
Counter([str(i) for i in mylist])

But this returns a Counter keyed by strings, not a list of dictionaries.
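
For illustration, here is a minimal sketch of what the attempt above produces, using the same `mylist`. It only shows the type of the keys and the counts; it is not a fix.

```python
from collections import Counter

mylist = [
    {'original_word': 'test1', 'corrected_word': ('test12', 3)},
    {'original_word': 'test1', 'corrected_word': ('test12', 3)},
    {'original_word': 'test2', 'corrected_word': ('test22', 2)},
    {'original_word': 'test3', 'corrected_word': ('test3', 33)},
    {'original_word': 'test3', 'corrected_word': ('test3', 33)},
    {'original_word': 'test3', 'corrected_word': ('test3', 33)},
]

# Counting the string form of each dict works because strings are hashable,
# but the keys of the resulting Counter are strings, not dicts.
counts = Counter(str(d) for d in mylist)

for key, count in counts.items():
    # Each key is the repr of a dict, so the original structure is lost
    print(type(key).__name__, count)
```

The counts themselves are right; what is missing is a way to get back from each key to the original dictionary.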

Answer 1

Score: 1

  1. Convert each dict to a tuple of its items before counting, because dicts are not hashable
  2. Convert the tuples back to dicts and add the count
  3. Sort by the number in corrected_word

from collections import Counter

def dict_and_add_count(item):
    # item is a (tuple_of_dict_items, count) pair from Counter.items()
    original_data, count = item
    original_dict = dict(original_data)
    # Append the count to the end of the corrected_word tuple
    original_dict['corrected_word'] = (*original_dict['corrected_word'], count)
    return original_dict

# Step 1: count hashable tuples instead of the dicts themselves
counted_unique_tuples = Counter(tuple(d.items()) for d in mylist)
# Step 2: rebuild the dicts with the count attached
dict_with_count = map(dict_and_add_count, counted_unique_tuples.items())
# Step 3: sort by the number at index 1 of corrected_word
sorted_dicts = sorted(dict_with_count, key=lambda x: x['corrected_word'][1])

sorted_dicts will be:

[{'original_word': 'test2', 'corrected_word': ('test22', 2, 1)},
 {'original_word': 'test1', 'corrected_word': ('test12', 3, 2)},
 {'original_word': 'test3', 'corrected_word': ('test3', 33, 3)}]
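
The three steps above can be put together as a self-contained sketch. The helper name `count_and_sort` is hypothetical and not part of the answer. The sketch also assumes that every dict lists its keys in the same order, since `tuple(d.items())` keeps insertion order and two dicts with the same content but different key order would otherwise be counted separately.

```python
from collections import Counter

mylist = [
    {'original_word': 'test1', 'corrected_word': ('test12', 3)},
    {'original_word': 'test1', 'corrected_word': ('test12', 3)},
    {'original_word': 'test2', 'corrected_word': ('test22', 2)},
    {'original_word': 'test3', 'corrected_word': ('test3', 33)},
    {'original_word': 'test3', 'corrected_word': ('test3', 33)},
    {'original_word': 'test3', 'corrected_word': ('test3', 33)},
]

def count_and_sort(dicts):
    """Hypothetical helper: count identical dicts and sort by corrected_word[1]."""
    # Assumption: all dicts share the same key order, and all values are hashable
    counts = Counter(tuple(d.items()) for d in dicts)
    result = []
    for items, count in counts.items():
        entry = dict(items)
        # Append the count as the last element of the corrected_word tuple
        entry['corrected_word'] = (*entry['corrected_word'], count)
        result.append(entry)
    return sorted(result, key=lambda e: e['corrected_word'][1])

sorted_dicts = count_and_sort(mylist)
print(sorted_dicts)
```

If the key order can vary between dicts, one option is to build the Counter key from `tuple(sorted(d.items()))` instead, at the cost of the rebuilt dicts having their keys in sorted order.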

Answer 2

Score: 1

Build one tuple per item, where the first element is the original word and the remaining elements are those of the corresponding corrected_word tuple. Then pass these tuples to Counter:

from collections import Counter
ctr = Counter((item['original_word'], *item['corrected_word']) for item in mylist)

This gives:

Counter({('test3', 'test3', 33): 3, ('test1', 'test12', 3): 2, ('test2', 'test22', 2): 1})

Then, build your result list and sort it by the value you want:

result = sorted(
    [
        {'original_word': ow, 'corrected_word': (*cw, count)}
        for (ow, *cw), count in ctr.items()
    ],
    key=lambda item: item['corrected_word'][1],
)

Which gives the desired result:

[
 {'original_word': 'test2', 'corrected_word': ('test22', 2, 1)},
 {'original_word': 'test1', 'corrected_word': ('test12', 3, 2)},
 {'original_word': 'test3', 'corrected_word': ('test3', 33, 3)}
]

[Try it online!](https://tio.run/##xVHLasMwELzrK/YmKZhCrFsgp36GMcWVlVZU1prVhmBKvt2V4jSUNPhWqpOY8TzkGSd@x2jm@UA4gMUQnGWPMYEfRiSGZzxGdiTEMAWfGPbQiE@J5N987MLLCamXO5DsEm9lBdIiUbZw/TelFq7OpNHn6k/E9Yq4LmS9IjYr4sIZ859i0QphmfJ/vy6hlPLshubOsq1gs@B3hq2GAxIUDnyEZUathRjJR1bZO9/JpWMo46a8uetVI@B2fpfH08PeG5txW1rq8yVTlQ8zqq9wyc95T6VLUvpHRm7/4aZ96IbXvrt03cHj1zTb9tZ9aa3n@Qs "Python 3 – Try It Online")
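
The `(ow, *cw), count` target in the comprehension does two unpackings at once, which can be easy to misread. As a rough sketch, the snippet below unpacks a single Counter entry by hand to show what each name holds, then runs the same comprehension as the answer. The input list is copied from the question.

```python
from collections import Counter

mylist = [
    {'original_word': 'test1', 'corrected_word': ('test12', 3)},
    {'original_word': 'test1', 'corrected_word': ('test12', 3)},
    {'original_word': 'test2', 'corrected_word': ('test22', 2)},
    {'original_word': 'test3', 'corrected_word': ('test3', 33)},
    {'original_word': 'test3', 'corrected_word': ('test3', 33)},
    {'original_word': 'test3', 'corrected_word': ('test3', 33)},
]

ctr = Counter((item['original_word'], *item['corrected_word']) for item in mylist)

# Unpack the most frequent entry: the key is a flat tuple, the value is its count.
# ow takes the first element of the key; *cw collects the rest into a list.
(ow, *cw), count = ctr.most_common(1)[0]
print(ow, cw, count)

# (*cw, count) turns that list back into a tuple with the count appended
result = sorted(
    [
        {'original_word': ow, 'corrected_word': (*cw, count)}
        for (ow, *cw), count in ctr.items()
    ],
    key=lambda item: item['corrected_word'][1],
)
print(result)
```

Flattening the key this way avoids depending on dict key order, but it does hard-code the two key names `original_word` and `corrected_word`.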

huangapple
  • Posted on February 24, 2023
  • Source: https://go.coder-hub.com/75553033.html