July 20, 2023, 15:40:25

How to process a JSON file to include only 4 fields from each item?

Question

This is what my JSON file looks like:

[
    {
        "id": 13445,
        "uuid": "bf1923c5-a198-409b-851e-c67f7b8a661e",
        "created_at": "2021-07-27 12:41:31.715922",
        "updated_at": "2021-11-10 21:41:41.857982",
        "meta": {
            "osm_id": null,
            "google_maps_place_id": "ChIJO4sAE5d-1EARsPwZh_s6eX4"
        },
        "type": "VILLAGE",
        "name": {
            "en": "Dіbrіvka",
            "ru": "Дибривка",
            "uk": "Дібрівка"
        },
        "public_name": {
            "en": "village Dіbrіvka",
            "ru": "п. Дибривка",
            "uk": "с. Дібрівка"
        },
        "post_code": [
            "09226"
        ],
        "katottg": "UA32120130090047792",
        "koatuu": "3222281602",
        "lng": 30.9903521,
        "lat": 49.88724620000001,
        "parent_id": 1051
    }
]

With the following code I can read the JSON and convert it to a Python dict:

import json

with open('ua_locations.json', 'r', encoding="utf-8") as user_file:
    file_contents = user_file.read()
    json_object = json.loads(file_contents)[0]
print(type(json_object))
print(json_object['id']['public_name']['lng']['lat'])

How can I keep only 4 fields (id, public_name, lng, lat) for each element in the list?

Answer 1

Score: 1

Create a list (or tuple) of the keys you're interested in. You can then nest a dictionary comprehension inside a list comprehension as follows:

import json

INPUT_FILE = '/Volumes/G-Drive/ua_locations.json'
KEYS = ('id', 'public_name', 'lng', 'lat')

with open(INPUT_FILE, encoding='utf-8') as j:
    # Build one filtered dict per item; .get() yields None for a missing key
    for d in [{k: item.get(k) for k in KEYS} for item in json.load(j)]:
        print(d)

Output:

{'id': 13445, 'public_name': {'en': 'village Dіbrіvka', 'ru': 'п. Дибривка', 'uk': 'с. Дібрівка'}, 'lng': 30.9903521, 'lat': 49.88724620000001}
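
If the goal is a new JSON file rather than printed output, the same comprehension can feed json.dump. The following is only a sketch: the function name write_filtered and the output file name are made up for illustration, and the demo writes a one-item stand-in for ua_locations.json to a temporary directory so that it runs on its own.

```python
import json
import tempfile
from pathlib import Path

KEYS = ('id', 'public_name', 'lng', 'lat')


def write_filtered(input_path, output_path, keys=KEYS):
    """Read a JSON array of objects and write a copy keeping only `keys`."""
    with open(input_path, encoding='utf-8') as src:
        items = json.load(src)
    filtered = [{k: item.get(k) for k in keys} for item in items]
    with open(output_path, 'w', encoding='utf-8') as dst:
        # ensure_ascii=False writes the Cyrillic names as-is instead of \uXXXX escapes
        json.dump(filtered, dst, ensure_ascii=False, indent=4)
    return filtered


# Self-contained demo with a shortened stand-in record
sample = [{
    'id': 13445,
    'type': 'VILLAGE',
    'public_name': {'en': 'village Dіbrіvka', 'uk': 'с. Дібрівка'},
    'lng': 30.9903521,
    'lat': 49.88724620000001,
    'parent_id': 1051,
}]

with tempfile.TemporaryDirectory() as tmp:
    src_path = Path(tmp) / 'ua_locations.json'
    dst_path = Path(tmp) / 'ua_locations_filtered.json'  # hypothetical output name
    src_path.write_text(json.dumps(sample, ensure_ascii=False), encoding='utf-8')
    write_filtered(src_path, dst_path)
    round_trip = json.loads(dst_path.read_text(encoding='utf-8'))

print(round_trip)
```

Passing encoding='utf-8' explicitly on both reads and writes avoids depending on the platform's default encoding, which matters here because the names contain non-ASCII characters.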

Answer 2

Score: -1

You are chaining the keys in your example, which doesn't do what you expect.

Chained indexing does not look up several keys on the same object. Your example would only work if the object were a nested dictionary with the following structure:

d = {'id': {'public_name': {'lng': {'lat': 42}}}}

In JSON format:

{
    "id": {
        "public_name": {
            "lng": {
                "lat": 42
            }
        }
    }
}

d['id']['public_name']['lng']['lat'] is evaluated from left to right: each subscript looks up its key in the current object, and the value it finds becomes the current object for the next subscript. The result is 42.

To get the fields, query the same object for each one, like so:

(d['id'], d['public_name'], d['lng'], d['lat'])

This gets the field values but omits the field names.
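
To make the difference concrete, here is a small sketch using a shortened stand-in for one record from the question (the variable names are illustrative only). With that flat record, the chained form fails on the second subscript because record['id'] is an int:

```python
# Shortened stand-in for one item of the question's JSON array
record = {
    'id': 13445,
    'public_name': {'en': 'village Dіbrіvka'},
    'lng': 30.9903521,
    'lat': 49.88724620000001,
}

# Chained: record['id'] is the int 13445, and an int cannot be subscripted
try:
    record['id']['public_name']['lng']['lat']
    chained_error = None
except TypeError as exc:
    chained_error = exc
print(type(chained_error).__name__)

# Same-object: every lookup starts again from `record`
values = (record['id'], record['public_name'], record['lng'], record['lat'])
print(values)
```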

To get both the field names and values, use either of the following comprehensions:

{k: v for k, v in d.items() if k in {'id', 'public_name', 'lng', 'lat'}}

{k: d[k] for k in ['id', 'public_name', 'lng', 'lat']}

The first preserves the order in which the keys appear in the dict. The second guarantees that every item has the same field order, but it only works if all fields are present; a missing field raises KeyError.
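
The trade-off between the two forms can be seen on a deliberately awkward record. In this sketch the record is hypothetical: its keys are in a different order from the wanted list and 'lat' is missing. The dict.get variant at the end is an alternative not mentioned above, and it relies on dicts preserving insertion order (Python 3.7+).

```python
KEYS = ['id', 'public_name', 'lng', 'lat']
KEY_SET = set(KEYS)

# Hypothetical record: different key order, and 'lat' is missing
record = {'lng': 30.99, 'uuid': 'abc', 'id': 1, 'public_name': {'en': 'x'}}

# Filtering d.items(): follows the record's own key order and skips the absent 'lat'
by_items = {k: v for k, v in record.items() if k in KEY_SET}

# Indexing with d[k]: follows KEYS order, but raises KeyError on the missing key
try:
    by_index = {k: record[k] for k in KEYS}
except KeyError as exc:
    by_index = None
    missing = exc.args[0]

# dict.get keeps the KEYS order and fills missing fields with None
by_get = {k: record.get(k) for k in KEYS}

print(list(by_items))
print(missing)
print(by_get)
```

Which behavior is preferable depends on the data: a KeyError surfaces incomplete records early, while the other two forms let processing continue.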

Your top-level data is a list, which doesn't support strings as indexes. To apply the above operations to one element, first index into the list to get that element, like so:

# json_object here must be the whole list, i.e. json.loads(file_contents) without the trailing [0]
d = json_object[0]
print({k: d[k] for k in ['id', 'public_name', 'lng', 'lat']})

Use 0 for the first object, 1 for the second, et cetera.

You can define a function to reuse the same code:

def filter_fields(d):
    return {k: d[k] for k in ['id', 'public_name', 'lng', 'lat']}

Finally, to apply the transformation to all elements of the list, use a list comprehension:

import json

with open('ua_locations.json', 'r', encoding='utf8') as f:
    data = json.load(f)

# Keep only the four fields of interest from every element
result = [{k: e[k] for k in ['id', 'public_name', 'lng', 'lat']} for e in data]
print(result)
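
The filter_fields function and the list comprehension can also be combined. The sketch below adds a fields parameter, which is an assumption not present in the answer, and uses a small in-memory list in place of the parsed file so that it runs on its own:

```python
FIELDS = ('id', 'public_name', 'lng', 'lat')


def filter_fields(d, fields=FIELDS):
    """Return a new dict holding only `fields` from `d` (KeyError if one is absent)."""
    return {k: d[k] for k in fields}


# Stand-in for the list returned by json.load(f)
data = [
    {'id': 1, 'type': 'VILLAGE', 'public_name': {'en': 'a'}, 'lng': 30.0, 'lat': 49.0},
    {'id': 2, 'type': 'CITY', 'public_name': {'en': 'b'}, 'lng': 31.0, 'lat': 50.0},
]

filtered = [filter_fields(e) for e in data]
ids_only = [filter_fields(e, fields=('id',)) for e in data]
print(filtered)
print(ids_only)
```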
