Python: create a DataFrame from a nested dict with lists

Question


I am trying to create a DataFrame (and ultimately a CSV) that looks like this:

    App      id   stages   requestCpu  requestMemory
    appName  123  dev      1000        1024
    appName  123  staging  3200        1024

The source dict looks like this. It contains quite a lot of apps, but every app follows the same layout:

    test_data = {"appName": {"id": "123", "stages": {"dev": [{"request.cpu": 1000}, {"request.memory": 1024}], "staging": [{"request.cpu": 3200}, {"request.memory": 1024}]}}, "appName2"...}

I previously used something like this:

    df = pd.DataFrame.from_dict(test_data, orient='index')
    df = pd.concat([df.drop(['stages'], axis=1), df['stages'].apply(pd.Series)], axis=1)
    df.index.name = "App"

However, this does not split up the list part, and the stages end up as columns, which is not the layout I want.

Any help is much appreciated, thanks.
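For reference, a minimal reproduction of that attempt makes the problem visible. This is only a sketch: the sample is cut down to a single app from the data above, and the variable name `wide` is introduced here purely for illustration.

```python
import pandas as pd

# One-app sample with the same layout as the data in the question.
test_data = {
    "appName": {
        "id": "123",
        "stages": {
            "dev": [{"request.cpu": 1000}, {"request.memory": 1024}],
            "staging": [{"request.cpu": 3200}, {"request.memory": 1024}],
        },
    },
}

# The attempt from the question: one row per app, then expand "stages".
df = pd.DataFrame.from_dict(test_data, orient="index")
wide = pd.concat(
    [df.drop(["stages"], axis=1), df["stages"].apply(pd.Series)],
    axis=1,
)
wide.index.name = "App"

# Each stage name became a column, and every cell in those columns
# still holds the original list of single-entry metric dicts.
print(wide.columns.tolist())
print(wide.loc["appName", "dev"])
```

The frame stays at one row per app, so getting one row per (app, stage) pair needs the data to be reshaped before or after loading.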

Answer 1

Score: 0


The easiest solution is to build the rows yourself before loading them into pandas:

    import pandas as pd

    test_data = {
        "appName": {"id": "123", "stages": {
            "dev": [{"request.cpu": 1000}, {"request.memory": 1024}],
            "staging": [{"request.cpu": 3200}, {"request.memory": 1024}]}},
        "appName2": {"id": "456", "stages": {
            "dev": [{"request.cpu": 1000}, {"request.memory": 1024}],
            "staging": [{"request.cpu": 3200}, {"request.memory": 1024}]}},
    }

    rows = []
    for app, app_data in test_data.items():
        for stage, stage_data in app_data["stages"].items():
            # One output row per (app, stage) pair.
            row = {
                "App": app,
                "id": app_data["id"],
                "stages": stage
            }
            # Each metric is a single-entry dict such as {"request.cpu": 1000}.
            for metric in stage_data:
                metric_name, metric_value = list(metric.items())[0]
                row[metric_name] = metric_value
            rows.append(row)

    df = pd.json_normalize(rows)

    # Reorder columns
    df = df[["App", "id", "stages", "request.cpu", "request.memory"]]

Output:

       App       id   stages   request.cpu  request.memory
    0  appName   123  dev      1000         1024
    1  appName   123  staging  3200         1024
    2  appName2  456  dev      1000         1024
    3  appName2  456  staging  3200         1024
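If the goal is the CSV layout shown in the question, one possible follow-up is to rename the dotted metric columns and export without the index. This is a sketch rather than part of the answer: the two-entry `rows` list is a shortened stand-in for the rows built above, `csv_text` is a name introduced here, and the target headers `requestCpu` and `requestMemory` are taken from the table in the question.

```python
import pandas as pd

# Shortened stand-in for the rows built by the loop in the answer.
rows = [
    {"App": "appName", "id": "123", "stages": "dev",
     "request.cpu": 1000, "request.memory": 1024},
    {"App": "appName", "id": "123", "stages": "staging",
     "request.cpu": 3200, "request.memory": 1024},
]
df = pd.DataFrame(rows)

# Rename the dotted metric names to the headers used in the question.
df = df.rename(columns={
    "request.cpu": "requestCpu",
    "request.memory": "requestMemory",
})

# With no path argument, to_csv returns the CSV as a string;
# index=False leaves out the 0, 1, ... row labels.
csv_text = df.to_csv(index=False)
print(csv_text)
```

Passing a file path as the first argument to `to_csv` writes the same content to disk instead of returning it.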

huangapple
  • Posted on February 18, 2023 at 20:28:46
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75493336.html