March 4, 2023 00:40:01

Python: remove duplicates from JSON based on two key values

Question

I have a JSON file organised like the one below, and I would like to remove all entries that repeat the same pair of values for two keys (name and city).

[{'name': 'anna', 'city': 'paris','code': '5'},
{'name': 'anna', 'city': 'paris','code': '2'},
{'name': 'henry', 'city': 'london','code': '1'},
{'name': 'henry', 'city': 'london','code': '3'},...]

Expected output:

[{'name': 'anna', 'city': 'paris'},{'name': 'henry', 'city': 'london'}]

I am struggling with this task. Any ideas?

Answer 1

Score: 0

You need to build a unique key from (name, city). For records that share the same pair, you only need to decide which one to keep in the final result.

Once that is done, take the values of the dictionary, and that is the answer.

With the walrus operator and a dict comprehension:

>>> l = [{'name': 'anna', 'city': 'paris', 'code': '5'}, {'name': 'anna', 'city': 'paris', 'code': '2'}, {'name': 'henry', 'city': 'london', 'code': '1'}, {'name': 'henry', 'city': 'london', 'code': '3'}]
>>> result = {(name := subdict['name'], city := subdict['city']): dict(name=name, city=city) for subdict in l}
>>> result
{('anna', 'paris'): {'name': 'anna', 'city': 'paris'}, ('henry', 'london'): {'name': 'henry', 'city': 'london'}}
>>> solution = list(result.values())
>>> solution
[{'name': 'anna', 'city': 'paris'}, {'name': 'henry', 'city': 'london'}]
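
The "which one to keep" decision mentioned in this answer can be made explicit. The following is a minimal sketch rather than part of the original answer: the helper name `dedupe_by_keys` and its `keep` parameter are hypothetical, introduced only for illustration. It returns the full winning record for each (name, city) pair, so the effect of keeping the first versus the last record is visible in the 'code' values.

```python
def dedupe_by_keys(records, keys, keep="first"):
    """Return one full record per unique combination of `keys`.

    Hypothetical helper: `keep` selects whether the first or the last
    record seen for each combination is retained.
    """
    if keep not in ("first", "last"):
        raise ValueError("keep must be 'first' or 'last'")
    chosen = {}
    for record in records:
        # Tuples are hashable, so they can serve as dictionary keys.
        key = tuple(record[k] for k in keys)
        if keep == "first":
            chosen.setdefault(key, record)  # ignore later duplicates
        else:
            chosen[key] = record  # later duplicates overwrite earlier ones
    return list(chosen.values())


records = [
    {'name': 'anna', 'city': 'paris', 'code': '5'},
    {'name': 'anna', 'city': 'paris', 'code': '2'},
    {'name': 'henry', 'city': 'london', 'code': '1'},
    {'name': 'henry', 'city': 'london', 'code': '3'},
]

first = dedupe_by_keys(records, ("name", "city"), keep="first")
last = dedupe_by_keys(records, ("name", "city"), keep="last")

# Project down to the two keys to match the output asked for in the question.
projected = [{k: r[k] for k in ("name", "city")} for r in first]
print(projected)
```

When only 'name' and 'city' are kept, as in the question, the choice between first and last makes no difference to the result; it matters only if other fields such as 'code' are retained.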

Answer 2

Score: 0

In pure Python, you can select the fields you need from each dictionary row, collect them as hashable tuples in a set (built with {}), and then rebuild the rows from what you selected.

items = [
    {'name': 'anna', 'city': 'paris','code': '5'},
    {'name': 'anna', 'city': 'paris','code': '2'},
    {'name': 'henry', 'city': 'london','code': '1'},
    {'name': 'henry', 'city': 'london','code': '3'}
]

# Set of (name, city) tuples: duplicates collapse automatically
unique = {(item["name"], item["city"]) for item in items}

# Rebuild one dictionary per unique pair
unique = [{"name": item[0], "city": item[1]} for item in unique]
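
One caveat with the set-based version is that a set does not guarantee iteration order, so the rows may not come back in input order. If order matters, a minimal sketch (assuming Python 3.7 or later, where dictionaries preserve insertion order) swaps the set for `dict.fromkeys`. The variable names `pairs` and `unique_rows` are illustrative and not from the original answer.

```python
items = [
    {'name': 'anna', 'city': 'paris', 'code': '5'},
    {'name': 'anna', 'city': 'paris', 'code': '2'},
    {'name': 'henry', 'city': 'london', 'code': '1'},
    {'name': 'henry', 'city': 'london', 'code': '3'},
]

# Dictionary keys are unique and keep first-seen order (Python 3.7+).
pairs = dict.fromkeys((item["name"], item["city"]) for item in items)

# Rebuild one row per unique (name, city) pair, in input order.
unique_rows = [{"name": name, "city": city} for name, city in pairs]
print(unique_rows)
```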

Answer 3

Score: 0

Here's another approach (one of many) that utilises a set as follows:

input_list = [
    {'name': 'anna', 'city': 'paris', 'code': '5'},
    {'name': 'anna', 'city': 'paris', 'code': '2'},
    {'name': 'henry', 'city': 'london', 'code': '1'},
    {'name': 'henry', 'city': 'london', 'code': '3'}
]

output_list = []
unique_names = set()

for d in input_list:
    if (name := d.get('name')) not in unique_names:
        output_list.append({k: v for k, v in d.items() if k != 'code'})
        unique_names.add(name)

print(output_list)

Output:

[{'name': 'anna', 'city': 'paris'}, {'name': 'henry', 'city': 'london'}]

Note:

There's at least one benefit of doing it this way. The other answers build the new dictionaries by explicitly including the keys 'name' and 'city', which implicitly ignores 'code'; that is fine for the data as shown. This approach instead builds the new dictionaries by excluding 'code'. This means the structure of the input dictionaries can change without altering the functional code: the 'code' key could be absent, and key/value pairs other than 'name' and 'city' could be introduced.
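
Note that the loop in this answer tracks only 'name' in its set, so two records with the same name but different cities would be merged into one. Since the question asks for uniqueness on two keys, here is a hedged variant (a sketch, not the original answer's code) that keys on the (name, city) pair while still excluding 'code'. The extra 'anna'/'rome' row and the names `seen` and `key` are added purely for illustration.

```python
input_list = [
    {'name': 'anna', 'city': 'paris', 'code': '5'},
    {'name': 'anna', 'city': 'paris', 'code': '2'},
    {'name': 'henry', 'city': 'london', 'code': '1'},
    {'name': 'henry', 'city': 'london', 'code': '3'},
    # Illustrative extra row: same name, different city.
    {'name': 'anna', 'city': 'rome', 'code': '7'},
]

output_list = []
seen = set()

for d in input_list:
    # Deduplicate on the pair of values, not on the name alone.
    key = (d.get('name'), d.get('city'))
    if key not in seen:
        seen.add(key)
        # Copy every field except 'code', as in the original answer.
        output_list.append({k: v for k, v in d.items() if k != 'code'})

print(output_list)
```

With this change, the 'anna'/'rome' record survives as a separate entry, whereas a name-only check would drop it.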
