Summing values in JSON objects in Python

Question

I have two JSON objects. I want to merge them, but wherever two entries have the same key, their obj_count fields should be summed. Is there a way to do this in Python?

Here is an example.
This is the first JSON object:

[
    {"text": " pen and ink and watercolour", "id": "x32505 ", "obj_count": 1855},
    {"text": " watercolour", "id": "x33202 ", "obj_count": 674},
    {"text": "pencil", "id": "AAT16013 ", "obj_count": 297}
]

And here is the second JSON object:

[
    {"text": " pen and ink and watercolour", "id": "x32505 ", "obj_count": 807},
    {"text": " watercolour", "id": "x33202 ", "obj_count": 97},
    {"text": " ink", "id": "AAT15012 ", "obj_count": 297}
]

What I want is something like this:

[
   {"text":" pen and ink and watercolour","id":"x32505 ","obj_count": 2662 #summed},
   {"text":" watercolour","id":"x33202 ","obj_count": 771 #summed},
   {"text":" ink","id":"AAT15012 ","obj_count":297},
   {"text":"pencil","id":"AAT16013 ","obj_count":297}
]

Answer 1

Score: 3

Use a dict keyed by id to keep track of which ids you have already seen:

  • if you have seen the id, add the item's obj_count to the stored entry
  • if you haven't, just save the item

values_a = [
    {"text": " pen and ink and watercolour", "id": "x32505 ", "obj_count": 1855},
    {"text": " watercolour", "id": "x33202 ", "obj_count": 674},
    {"text": "pencil", "id": "AAT16013 ", "obj_count": 297}
]

values_b = [
    {"text": " pen and ink and watercolour", "id": "x32505 ", "obj_count": 807},
    {"text": " watercolour", "id": "x33202 ", "obj_count": 97},
    {"text": " ink", "id": "AAT15012 ", "obj_count": 297}
]

result = {}
for item in [*values_a, *values_b]:
    if item['id'] in result:
        result[item['id']]['obj_count'] += item['obj_count']
    else:
        result[item['id']] = item

# back to list of items
result = list(result.values())
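
One thing to be aware of with the loop above: the dicts stored in result are the same objects that live in values_a, so values_a is modified whenever a count is added. If that matters, a non-mutating variant might look like the sketch below. The function name merge_counts and its key and field parameters are illustrative and not part of the original answer.

```python
def merge_counts(*lists, key="id", field="obj_count"):
    """Merge lists of dicts, summing `field` for entries that share `key`."""
    merged = {}
    for lst in lists:
        for item in lst:
            k = item[key]
            if k in merged:
                merged[k][field] += item[field]
            else:
                # Shallow copy so the caller's dicts are left untouched.
                merged[k] = dict(item)
    return list(merged.values())


values_a = [
    {"text": "pen and ink and watercolour", "id": "x32505", "obj_count": 1855},
    {"text": "watercolour", "id": "x33202", "obj_count": 674},
    {"text": "pencil", "id": "AAT16013", "obj_count": 297},
]
values_b = [
    {"text": "pen and ink and watercolour", "id": "x32505", "obj_count": 807},
    {"text": "watercolour", "id": "x33202", "obj_count": 97},
    {"text": "ink", "id": "AAT15012", "obj_count": 297},
]

result = merge_counts(values_a, values_b)
```

Because each first occurrence is copied, values_a and values_b keep their original counts after the call, and the function accepts any number of lists.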

Answer 2

Score: 1

Yes.

Any loading and saving can be done with the json module (it is not used in the code below, though).

def sum_list_of_dict(source, add):
    for add_elem in add:
        found = False
        for source_elem in source:
            if add_elem["id"] == source_elem["id"]:
                source_elem["obj_count"] += add_elem["obj_count"]
                found = True
                break  # dupes should not be present
        if not found:
            source.append(add_elem)
    return source


data1 = [
    {"text": "pen and ink and watercolour", "id": "x32505", "obj_count": 1855},
    {"text": "watercolour", "id": "x33202", "obj_count": 674},
    {"text": "pencil", "id": "AAT16013", "obj_count": 297},
]

data2 = [
    {"text": "pen and ink and watercolour", "id": "x32505", "obj_count": 807},
    {"text": "watercolour", "id": "x33202", "obj_count": 97},
    {"text": "ink", "id": "AAT15012", "obj_count": 297},
]

data3 = sum_list_of_dict(data1, data2)

# just for pretty printing
from pprint import pprint
pprint(data3)

Output:

[{'id': 'x32505', 'obj_count': 2662, 'text': 'pen and ink and watercolour'},
 {'id': 'x33202', 'obj_count': 771, 'text': 'watercolour'},
 {'id': 'AAT16013', 'obj_count': 297, 'text': 'pencil'},
 {'id': 'AAT15012', 'obj_count': 297, 'text': 'ink'}]

Posted by huangapple on 2023-02-09 01:16:00
Link to this post: https://go.coder-hub.com/75389400.html