从嵌套的JSON创建规范化的数据框。

huangapple go评论187阅读模式
英文:

Create normalized dataframe from a nested json

问题

Sure, here's the translated code:

  1. with open("employee.json") as file:
  2. data = json.load(file)
  3. data_df = pd.json_normalize(data, 'addresses', ['rec_id', 'timestamp', 'edited_timestamp', 'user.id', 'user.name'])
  4. data_df.columns = ['rec_id', 'addresses.address_type', 'addresses.street1', 'addresses.street2', 'addresses.city', 'timestamp', 'edited_timestamp', 'user.id', 'user.name']
  5. display(data_df)
英文:

I am trying to create a dataframe from nested json file but running into trouble.

  1. [
  2. {
  3. "rec_id": "1",
  4. "user": {
  5. "id": "12414",
  6. "name": "Steve"
  7. },
  8. "addresses": [
  9. {
  10. "address_type": "Home",
  11. "street1": "100 Main St",
  12. "street2": null,
  13. "city": "Chicago"
  14. },
  15. {
  16. "address_type": "Work",
  17. "street1": "100 Main St",
  18. "street2": null,
  19. "city": "Chicago"
  20. }
  21. ],
  22. "timestamp": "2023-07-28T20:05:14.859000+00:00",
  23. "edited_timestamp": null
  24. },
  25. {
  26. "rec_id": "2",
  27. "user": {
  28. "id": "214521",
  29. "name": "Tim"
  30. },
  31. "addresses": [
  32. {
  33. "address_type": "Home",
  34. "street1": "100 Main St",
  35. "street2": null,
  36. "city": "Boston"
  37. },
  38. {
  39. "address_type": "Work",
  40. "street1": "100 Main St",
  41. "street2": null,
  42. "city": "Boston"
  43. }
  44. ],
  45. "timestamp": "2023-07-28T20:05:14.859000+00:00",
  46. "edited_timestamp": null
  47. },
  48. {
  49. "rec_id": "3",
  50. "user": {
  51. "id": "12121",
  52. "name": "Jack"
  53. },
  54. "addresses": [
  55. {
  56. "address_type": "Home",
  57. "street1": "100 Main St",
  58. "street2": null,
  59. "city": "Las Vegas"
  60. } ]
  61. "timestamp": "2023-07-28T20:05:14.859000+00:00",
  62. "edited_timestamp": null
  63. }
  64. ]

I tried below:

  1. with open("employee.json") as file:
  2. data = json.load(file)
  3. data_df = pd.json_normalize(data)
  4. data_df.columns.values.tolist()
  5. ['rec_id',
  6. 'addresses',
  7. 'timestamp',
  8. 'edited_timestamp',
  9. 'user.id',
  10. 'user.name']
  11. display(data_df)
  1. rec_id addresses timestamp edited_timestamp user.id user.name
  2. 0 1 [{'address_type': 'Home', 'street1': '100 Main St', 'street2': None, 'city': 'Chicago'}, {'address_type': 'Work', 'street1': '100 Main St', 'street2': None, 'city': 'Chicago'}] 2023-07-28T20:05:14.859000+00:00 None 12414 Steve
  3. 1 2 [{'address_type': 'Home', 'street1': '100 Main St', 'street2': None, 'city': 'Boston'}, {'address_type': 'Work', 'street1': '100 Main St', 'street2': None, 'city': 'Boston'}] 2023-07-28T20:05:14.859000+00:00 None 214521 Tim
  4. 2 3 [{'address_type': 'Home', 'street1': '100 Main St', 'street2': None, 'city': 'Las Vegas'}] 2023-07-28T20:05:14.859000+00:00 None 12121 Jack

How do I get the output as below -

  1. rec_id addresses.address_type addresses.street1 addresses.street2 addresses.city timestamp edited_timestamp user.id user.name
  2. 0 1 Home 100 Main St None Chicago 2023-07-28T20:05:14.859000+00:00 None 12414 Steve
  3. 1 1 Work 100 Main St None Chicago 2023-07-28T20:05:14.859000+00:00 None 12414 Steve
  4. 2 2 Home 100 Main St None Boston 2023-07-28T20:05:14.859000+00:00 None 214521 Tim
  5. 3 2 Work 100 Main St None Boston 2023-07-28T20:05:14.859000+00:00 None 214521 Tim
  6. 4 3 Home 100 Main St None Las Vegas 2023-07-28T20:05:14.859000+00:00 None 12121 Jack

答案1

得分: 1

这是您要翻译的内容:

尝试这个,更多关于 .json_normalize

  1. df = pd.json_normalize(data,
  2. record_path='addresses',
  3. meta=['rec_id', ["user", "id"], ["user", "name"], 'timestamp', 'edited_timestamp'])

输出:

address_type street1 street2 city rec_id user.id user.name timestamp edited_timestamp
0 Home 100 Main St Chicago 1 12414 Steve 2023-07-28T20:05:14.859000+00:00
1 Work 100 Main St Chicago 1 12414 Steve 2023-07-28T20:05:14.859000+00:00
2 Home 100 Main St Boston 2 214521 Tim 2023-07-28T20:05:14.859000+00:00
3 Work 100 Main St Boston 2 214521 Tim 2023-07-28T20:05:14.859000+00:00
4 Home 100 Main St Las Vegas 3 12121 Jack 2023-07-28T20:05:14.859000+00:00
英文:

Try this, more about .json_normalize

  1. df = pd.json_normalize(data,
  2. record_path='addresses',
  3. meta=['rec_id', ["user", "id"], ["user", "name"], 'timestamp', 'edited_timestamp'])

Output:

address_type street1 street2 city rec_id user.id user.name timestamp edited_timestamp
0 Home 100 Main St Chicago 1 12414 Steve 2023-07-28T20:05:14.859000+00:00
1 Work 100 Main St Chicago 1 12414 Steve 2023-07-28T20:05:14.859000+00:00
2 Home 100 Main St Boston 2 214521 Tim 2023-07-28T20:05:14.859000+00:00
3 Work 100 Main St Boston 2 214521 Tim 2023-07-28T20:05:14.859000+00:00
4 Home 100 Main St Las Vegas 3 12121 Jack 2023-07-28T20:05:14.859000+00:00

答案2

得分: 0

尝试:

  1. data_df = pd.json_normalize(
  2. data,
  3. meta=["rec_id", "timestamp", "edited_timestamp", ["user", "id"], ["user", "name"]],
  4. record_path=["addresses"],
  5. record_prefix="addresses."
  6. )
  7. print(data_df)

打印:

  1. addresses.address_type addresses.street1 addresses.street2 addresses.city rec_id timestamp edited_timestamp user.id user.name
  2. 0 Home 100 Main St None Chicago 1 2023-07-28T20:05:14.859000+00:00 None 12414 Steve
  3. 1 Work 100 Main St None Chicago 1 2023-07-28T20:05:14.859000+00:00 None 12414 Steve
  4. 2 Home 100 Main St None Boston 2 2023-07-28T20:05:14.859000+00:00 None 214521 Tim
  5. 3 Work 100 Main St None Boston 2 2023-07-28T20:05:14.859000+00:00 None 214521 Tim
  6. 4 Home 100 Main St None Las Vegas 3 2023-07-28T20:05:14.859000+00:00 None 12121 Jack
英文:

Try:

  1. data_df = pd.json_normalize(
  2. data,
  3. meta=["rec_id", "timestamp", "edited_timestamp", ["user", "id"], ["user", "name"]],
  4. record_path=["addresses"],
  5. record_prefix="addresses.",
  6. )
  7. print(data_df)

Prints:

  1. addresses.address_type addresses.street1 addresses.street2 addresses.city rec_id timestamp edited_timestamp user.id user.name
  2. 0 Home 100 Main St None Chicago 1 2023-07-28T20:05:14.859000+00:00 None 12414 Steve
  3. 1 Work 100 Main St None Chicago 1 2023-07-28T20:05:14.859000+00:00 None 12414 Steve
  4. 2 Home 100 Main St None Boston 2 2023-07-28T20:05:14.859000+00:00 None 214521 Tim
  5. 3 Work 100 Main St None Boston 2 2023-07-28T20:05:14.859000+00:00 None 214521 Tim
  6. 4 Home 100 Main St None Las Vegas 3 2023-07-28T20:05:14.859000+00:00 None 12121 Jack

huangapple
  • 本文由 发表于 2023年8月5日 04:37:58
  • 转载请务必保留本文链接:https://go.coder-hub.com/76838992.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定