How to process a JSON file to include only 4 fields from each item?

Question

This is what my JSON file looks like:

[
    {
        "id": 13445,
        "uuid": "bf1923c5-a198-409b-851e-c67f7b8a661e",
        "created_at": "2021-07-27 12:41:31.715922",
        "updated_at": "2021-11-10 21:41:41.857982",
        "meta": {
            "osm_id": null,
            "google_maps_place_id": "ChIJO4sAE5d-1EARsPwZh_s6eX4"
        },
        "type": "VILLAGE",
        "name": {
            "en": "Dіbrіvka",
            "ru": "Дибривка",
            "uk": "Дібрівка"
        },
        "public_name": {
            "en": "village Dіbrіvka",
            "ru": "п. Дибривка",
            "uk": "с. Дібрівка"
        },
        "post_code": [
            "09226"
        ],
        "katottg": "UA32120130090047792",
        "koatuu": "3222281602",
        "lng": 30.9903521,
        "lat": 49.88724620000001,
        "parent_id": 1051
    }
]

With the following code I can read the JSON and convert it to a Python dict:

import json

with open('ua_locations.json', 'r', encoding="utf-8") as user_file:
    file_contents = user_file.read()
    json_object = json.loads(file_contents)[0]

print(type(json_object))
print(json_object['id']['public_name']['lng']['lat'])

How can I keep only 4 fields (id, public_name, lng, lat) for each element in the list?

Answer 1

Score: 1

Create a list (or tuple) of the keys you're interested in. You can then nest a dictionary comprehension inside a list comprehension as follows:

import json

INPUT_FILE = '/Volumes/G-Drive/ua_locations.json'
KEYS = ('id', 'public_name', 'lng', 'lat')

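# Explicit UTF-8 so the Cyrillic names decode the same way on every platform
with open(INPUT_FILE, encoding='utf-8') as j:
    # item.get(k) yields None instead of raising KeyError when a key is missing
    for d in [{k: item.get(k) for k in KEYS} for item in json.load(j)]:
        print(d)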

Output:

{'id': 13445, 'public_name': {'en': 'village Dіbrіvka', 'ru': 'п. Дибривка', 'uk': 'с. Дібрівка'}, 'lng': 30.9903521, 'lat': 49.88724620000001}
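
If the goal is a smaller JSON file rather than printed output, the same comprehension can feed json.dump. The following is only a sketch: the helper name filter_file, the output file name, and the temporary directory used for the demo are assumptions made for illustration, not part of the original answer.

```python
import json
import tempfile
from pathlib import Path

KEYS = ('id', 'public_name', 'lng', 'lat')


def filter_file(src, dst, keys=KEYS):
    """Read a JSON list from src, keep only `keys` in each item, write it to dst."""
    with open(src, encoding='utf-8') as f:
        items = json.load(f)
    filtered = [{k: item.get(k) for k in keys} for item in items]
    with open(dst, 'w', encoding='utf-8') as f:
        # ensure_ascii=False keeps the Cyrillic text readable in the output file
        json.dump(filtered, f, ensure_ascii=False, indent=4)
    return filtered


# Demo in a throwaway directory, using a one-item sample shaped like the question's data
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / 'ua_locations.json'
    dst = Path(tmp) / 'ua_locations_filtered.json'  # hypothetical output name
    sample = [{
        'id': 13445,
        'type': 'VILLAGE',
        'public_name': {'en': 'village Dіbrіvka', 'ru': 'п. Дибривка', 'uk': 'с. Дібрівка'},
        'lng': 30.9903521,
        'lat': 49.88724620000001,
        'parent_id': 1051,
    }]
    src.write_text(json.dumps(sample, ensure_ascii=False), encoding='utf-8')

    result = filter_file(src, dst)
    written = json.loads(dst.read_text(encoding='utf-8'))
    print(written)
```

Opening both files with an explicit encoding avoids depending on the platform's default, which matters here because the names contain non-ASCII characters.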

Answer 2

Score: -1

You are chaining the keys in your example, which does not do what you expected.

Chained lookups do not read several keys from the same object; each lookup is applied to the result of the previous one. Your example would only work if the object were a nested dictionary with the following structure:

d = {'id': {'public_name': {'lng': {'lat': 42}}}}

In JSON format:

{
    "id": {
        "public_name": {
            "lng": {
                "lat": 42
            }
        }
    }
}

d['id']['public_name']['lng']['lat'] is evaluated from left to right: each step looks up the current key in the current object, and the value it returns becomes the current object for the next step. The result here is 42.
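
The left-to-right walk can be spelled out as an explicit loop. This is a minimal sketch for illustration; the names current, record and error_message are made up here, and record is a shortened copy of the data from the question.

```python
# Nested dict from the example above: each key leads one level further down.
d = {'id': {'public_name': {'lng': {'lat': 42}}}}

# Equivalent of d['id']['public_name']['lng']['lat'], one lookup at a time.
current = d
for key in ('id', 'public_name', 'lng', 'lat'):
    current = current[key]  # the looked-up value becomes the new current object

print(current)

# With a flat record like the one in the question, the first lookup returns
# an int, and indexing an int with a string raises TypeError.
record = {'id': 13445, 'lng': 30.9903521, 'lat': 49.88724620000001}
error_message = None
try:
    record['id']['public_name']
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```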

To get the fields, you need to query the same object for each of them, like so:

(d['id'], d['public_name'], d['lng'], d['lat'])

This gets the field values but omits the field names.
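
As an aside, the standard library's operator.itemgetter builds the same tuple of values in one call. This is just an alternative sketch; the variable names get_fields and values are chosen for illustration, and d is a shortened copy of the record from the question.

```python
from operator import itemgetter

d = {
    'id': 13445,
    'public_name': {'en': 'village Dіbrіvka'},
    'lng': 30.9903521,
    'lat': 49.88724620000001,
    'parent_id': 1051,
}

# itemgetter with several keys returns a callable that yields a tuple of values
get_fields = itemgetter('id', 'public_name', 'lng', 'lat')
values = get_fields(d)
print(values)
```

Like the tuple above, this returns the values only, without the field names.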

To get the field names and values, use either of the following comprehensions:

{k: v for k, v in d.items() if k in {'id', 'public_name', 'lng', 'lat'}}

Or

{k: d[k] for k in ['id', 'public_name', 'lng', 'lat']}

The first preserves the order in which the keys appear in the dict. The second guarantees that every item has its fields in the same order, but it only works if all fields are present; a missing key raises KeyError.
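
The difference shows up as soon as a record lacks one of the fields. The sketch below uses a made-up record named partial that has no public_name or lng; the third variant, based on dict.get, is an addition for comparison and fills the gaps with None.

```python
KEYS = ['id', 'public_name', 'lng', 'lat']

# Hypothetical record: 'public_name' and 'lng' are missing, 'lat' comes before 'id'
partial = {'lat': 49.88, 'id': 1, 'uuid': 'x'}

# Variant 1: filter the dict's own items; keeps the dict's order, skips missing keys
by_filter = {k: v for k, v in partial.items() if k in set(KEYS)}
print(by_filter)

# Variant 2: look up each wanted key; fixed order, but a missing key raises KeyError
missing = None
try:
    by_lookup = {k: partial[k] for k in KEYS}
except KeyError as exc:
    missing = exc.args[0]
print('missing key:', missing)

# Variant 3: dict.get keeps the fixed order and substitutes None for missing keys
by_get = {k: partial.get(k) for k in KEYS}
print(by_get)
```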

Your top-level data is a list, which doesn't support using strings as indexes. To apply the above operations to one element, use an integer index to get the element first, like so (here json_object stands for the whole list returned by json.loads, without the [0] from your code):

d = json_object[0]
print({k: d[k] for k in ['id', 'public_name', 'lng', 'lat']})

Use 0 for the first object, 1 for the second, et cetera.

You can define a function to reuse the same code:

def filter_fields(d):
    return {k: d[k] for k in ['id', 'public_name', 'lng', 'lat']}

Finally, to apply the transformation to all elements of the list, use a list comprehension:

import json

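with open('ua_locations.json', 'r', encoding='utf8') as f:
    data = json.load(f)  # parse straight from the file object

# Assign the result; a bare comprehension is discarded when run as a script
filtered = [{k: e[k] for k in ['id', 'public_name', 'lng', 'lat']} for e in data]
print(filtered)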

huangapple
  • Published on July 20, 2023 at 15:40:25
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76727660.html