Python: create a DataFrame from a nested dict with lists

Question

I am trying to create a DataFrame / CSV that looks like this:

| App     | id  | stages  | requestCpu | requestMemory |
|:--------|:----|:--------|-----------:|--------------:|
| appName | 123 | dev     |       1000 |          1024 |
| appName | 123 | staging |       3200 |          1024 |

The dict looks like this. It contains quite a lot of apps, but the data inside every app follows the same layout:

    test_data = {"appName": {"id": "123", "stages": {"dev": [{"request.cpu": 1000}, {"request.memory": 1024}], "staging": [{"request.cpu": 3200}, {"request.memory": 1024}]}}, "appName2"...}

I used something like this before:

    df = pd.DataFrame.from_dict(test_data, orient='index')
    df = pd.concat([df.drop(['stages'], axis=1), (df['stages'].apply(pd.Series))], axis=1)
    df.index.name = "App"

However, this did not split up the lists, and the stages ended up as columns instead of rows, which is not how I wanted it to look.

Any help is much appreciated, thanks.

Answer 1

Score: 0

The easiest solution is to build the rows with a plain loop before loading them into pandas:

    import pandas as pd

    test_data = {
        "appName": {"id": "123", "stages": {"dev": [{"request.cpu": 1000}, {"request.memory": 1024}], "staging": [{"request.cpu": 3200}, {"request.memory": 1024}]}},
        "appName2": {"id": "456", "stages": {"dev": [{"request.cpu": 1000}, {"request.memory": 1024}], "staging": [{"request.cpu": 3200}, {"request.memory": 1024}]}},
    }

    rows = []

    # Build one flat row per (app, stage) pair.
    for app, app_data in test_data.items():
        for stage, stage_data in app_data["stages"].items():
            row = {
                "App": app,
                "id": app_data["id"],
                "stages": stage,
            }
            # Each metric is a single-key dict, e.g. {"request.cpu": 1000}.
            for metric in stage_data:
                metric_name, metric_value = list(metric.items())[0]
                row[metric_name] = metric_value
            rows.append(row)

    df = pd.json_normalize(rows)

    # Reorder columns
    df = df[["App", "id", "stages", "request.cpu", "request.memory"]]

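Output:

|    | App      |   id | stages   |   request.cpu |   request.memory |
|---:|:---------|-----:|:---------|--------------:|-----------------:|
|  0 | appName  |  123 | dev      |          1000 |             1024 |
|  1 | appName  |  123 | staging  |          3200 |             1024 |
|  2 | appName2 |  456 | dev      |          1000 |             1024 |
|  3 | appName2 |  456 | staging  |          3200 |             1024 |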
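
The question asks for a CSV with the headers `requestCpu` and `requestMemory`, while the answer's frame keeps the dotted keys. A minimal sketch of that last step might look like the following. The rename mapping is inferred from the desired table in the question, and the short `rows` list is only a stand-in for the list built by the loop in the answer.

```python
import pandas as pd

# Stand-in for the `rows` list built by the loop in the answer (one app only).
rows = [
    {"App": "appName", "id": "123", "stages": "dev",
     "request.cpu": 1000, "request.memory": 1024},
    {"App": "appName", "id": "123", "stages": "staging",
     "request.cpu": 3200, "request.memory": 1024},
]
df = pd.DataFrame(rows)

# Rename the dotted metric keys to the headers shown in the question
# (assumed mapping, taken from the desired output table).
df = df.rename(columns={"request.cpu": "requestCpu",
                        "request.memory": "requestMemory"})

# With no path given, to_csv returns the CSV as a string;
# index=False drops the numeric row index.
csv_text = df.to_csv(index=False)
print(csv_text)
```

To write a file instead, a path can be passed as the first argument to `to_csv`.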

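
If the same flattening is needed in more than one place, the loop from the answer could be wrapped in a small helper. The sketch below is one possible variant, not the original answer's code: `flatten_apps` is a hypothetical name, and it merges each metric dict with `dict.update`, which assumes metric names do not repeat within a stage.

```python
import pandas as pd


def flatten_apps(data):
    """Return a DataFrame with one row per (app, stage) pair.

    Hypothetical helper; assumes every app has "id" and "stages" keys
    and that each stage maps to a list of small metric dicts.
    """
    rows = []
    for app, app_data in data.items():
        for stage, metrics in app_data["stages"].items():
            row = {"App": app, "id": app_data["id"], "stages": stage}
            # Merge every metric dict into the row, whatever its size.
            for metric in metrics:
                row.update(metric)
            rows.append(row)
    return pd.DataFrame(rows)


sample = {
    "appName": {
        "id": "123",
        "stages": {
            "dev": [{"request.cpu": 1000}, {"request.memory": 1024}],
            "staging": [{"request.cpu": 3200}, {"request.memory": 1024}],
        },
    },
}

df = flatten_apps(sample)
print(df)
```

Because the rows are already flat, `pd.DataFrame(rows)` is enough here; the column order follows the order in which keys are added to each row.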



huangapple
  • Posted on February 18, 2023, 20:28:46
  • Please keep this link when reposting: https://go.coder-hub.com/75493336.html