Summing values in JSON objects in Python

Question

I have two JSON arrays of objects. I want to merge them, but wherever two entries share the same id, their obj_count fields should be summed. Is there a way to do this in Python?

Here is an example.
This is the first JSON array:

    [
        {"text": " pen and ink and watercolour", "id": "x32505 ", "obj_count": 1855},
        {"text": " watercolour", "id": "x33202 ", "obj_count": 674},
        {"text": "pencil", "id": "AAT16013 ", "obj_count": 297}
    ]

And here is the second JSON array:

    [
        {"text": " pen and ink and watercolour", "id": "x32505 ", "obj_count": 807},
        {"text": " watercolour", "id": "x33202 ", "obj_count": 97},
        {"text": " ink", "id": "AAT15012 ", "obj_count": 297}
    ]

What I want is something like this:

    [
        {"text": " pen and ink and watercolour", "id": "x32505 ", "obj_count": 2662},  # summed
        {"text": " watercolour", "id": "x33202 ", "obj_count": 771},  # summed
        {"text": " ink", "id": "AAT15012 ", "obj_count": 297},
        {"text": "pencil", "id": "AAT16013 ", "obj_count": 297}
    ]

Answer 1

Score: 3

Use a dict keyed by id to track which ids you have already seen:

  • if you have seen the id, add the item's obj_count to the stored entry
  • if you haven't, just save the item

    values_a = [
        {"text": " pen and ink and watercolour", "id": "x32505 ", "obj_count": 1855},
        {"text": " watercolour", "id": "x33202 ", "obj_count": 674},
        {"text": "pencil", "id": "AAT16013 ", "obj_count": 297}
    ]
    values_b = [
        {"text": " pen and ink and watercolour", "id": "x32505 ", "obj_count": 807},
        {"text": " watercolour", "id": "x33202 ", "obj_count": 97},
        {"text": " ink", "id": "AAT15012 ", "obj_count": 297}
    ]

    result = {}
    for item in [*values_a, *values_b]:
        if item['id'] in result:
            # id already seen: add this item's count to the stored entry
            result[item['id']]['obj_count'] += item['obj_count']
        else:
            # first time this id appears: store a copy so the
            # dicts in values_a / values_b are not modified
            result[item['id']] = dict(item)

    # back to list of items
    result = list(result.values())
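
The same idea can be wrapped in a reusable function. This is only a minimal sketch and not part of the original answer: the name `merge_counts`, its `key` and `count` parameters, and the decision to strip whitespace from ids (the question's ids end with a space) are all assumptions made for illustration. It copies each item, so the input lists are left unchanged.

```python
def merge_counts(*lists, key="id", count="obj_count"):
    """Merge lists of dicts, summing `count` for items that share `key`.

    Hypothetical helper for illustration. Ids are stripped of surrounding
    whitespace before comparison (an assumption, not in the original answer).
    """
    merged = {}
    for items in lists:
        for item in items:
            item_id = item[key].strip()
            if item_id in merged:
                merged[item_id][count] += item[count]
            else:
                # Copy so the caller's dicts are not modified.
                merged[item_id] = dict(item)
    return list(merged.values())


# Shortened sample data based on the question.
values_a = [
    {"text": "watercolour", "id": "x33202 ", "obj_count": 674},
    {"text": "pencil", "id": "AAT16013 ", "obj_count": 297},
]
values_b = [
    {"text": "watercolour", "id": "x33202 ", "obj_count": 97},
    {"text": "ink", "id": "AAT15012 ", "obj_count": 297},
]

result = merge_counts(values_a, values_b)
```

Because the lookup goes through a dict, each item is handled in roughly constant time, and the function accepts any number of input lists.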

Answer 2

Score: 1

Yes.

Any loading and saving can be done with the json module, although the code below does not use it.

    def sum_list_of_dict(source, add):
        for add_elem in add:
            found = False
            for source_elem in source:
                if add_elem["id"] == source_elem["id"]:
                    source_elem["obj_count"] += add_elem["obj_count"]
                    found = True
                    break  # dupes should not be present
            if not found:
                source.append(add_elem)
        return source

    data1 = [
        {"text": "pen and ink and watercolour", "id": "x32505", "obj_count": 1855},
        {"text": "watercolour", "id": "x33202", "obj_count": 674},
        {"text": "pencil", "id": "AAT16013", "obj_count": 297},
    ]
    data2 = [
        {"text": "pen and ink and watercolour", "id": "x32505", "obj_count": 807},
        {"text": "watercolour", "id": "x33202", "obj_count": 97},
        {"text": "ink", "id": "AAT15012", "obj_count": 297},
    ]
    data3 = sum_list_of_dict(data1, data2)

    # just for pretty printing
    from pprint import pprint
    pprint(data3)

Output:

    [{'id': 'x32505', 'obj_count': 2662, 'text': 'pen and ink and watercolour'},
     {'id': 'x33202', 'obj_count': 771, 'text': 'watercolour'},
     {'id': 'AAT16013', 'obj_count': 297, 'text': 'pencil'},
     {'id': 'AAT15012', 'obj_count': 297, 'text': 'ink'}]
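
The answer mentions the json module without using it. Here is a minimal sketch of the full round trip, assuming the two arrays arrive as JSON strings. The variable names and the shortened sample data are illustrative, and the merge step uses a dict keyed by id rather than this answer's nested loop.

```python
import json

# Illustrative input: two JSON arrays as strings (shortened sample data).
raw_a = '[{"text": "watercolour", "id": "x33202", "obj_count": 674}]'
raw_b = (
    '[{"text": "watercolour", "id": "x33202", "obj_count": 97},'
    ' {"text": "ink", "id": "AAT15012", "obj_count": 297}]'
)

# Parse the JSON text into Python lists of dicts.
data_a = json.loads(raw_a)
data_b = json.loads(raw_b)

# Sum obj_count per id, keeping the first item seen for each id.
totals = {}
for item in data_a + data_b:
    if item["id"] in totals:
        totals[item["id"]]["obj_count"] += item["obj_count"]
    else:
        totals[item["id"]] = dict(item)

# Serialize the merged list back to JSON text.
merged_json = json.dumps(list(totals.values()), indent=2)
print(merged_json)
```

For files rather than strings, `json.load` and `json.dump` take an open file object in place of the string.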

huangapple
  • Posted on 2023-02-09 01:16:00
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75389400.html