将多个 JSON 值附加到单个 Pandas 列中使用 Python

huangapple go评论103阅读模式
英文:

Appending multiple values of a json into a Single Pandas column using python

问题

I have trouble in getting values from a Json and storing in a Dataframe.

My Json looks like

  1. {
  2. "issues": [
  3. {
  4. "expand": "operations",
  5. "id": "1",
  6. "fields": {
  7. "customfield_100": [
  8. {
  9. "self": "https://url1001",
  10. "value": "Mobile",
  11. "id": "1001",
  12. "disabled": "false"
  13. }
  14. ]
  15. }
  16. },
  17. {
  18. "expand": "operations",
  19. "id": "2",
  20. "fields": {
  21. "customfield_100": [
  22. {
  23. "self": "https://url1002",
  24. "value": "Desktop",
  25. "id": "1002",
  26. "disabled": false
  27. },
  28. {
  29. "self": "https://url1001",
  30. "value": "Mobile",
  31. "id": "1001",
  32. "disabled": false
  33. }
  34. ]
  35. }
  36. },
  37. {
  38. "expand": "operations",
  39. "id": "3",
  40. "fields": {
  41. "customfield_100": [
  42. {
  43. "self": "https://url1003",
  44. "value": "ios",
  45. "id": "1002",
  46. "disabled": false
  47. }
  48. ]
  49. }
  50. },
  51. {
  52. "expand": "operations",
  53. "id": "4",
  54. "fields": {
  55. "customfield_100": [
  56. {
  57. "self": "https://url1002",
  58. "value": "Desktop",
  59. "id": "1002",
  60. "disabled": false
  61. },
  62. {
  63. "self": "https://url1001",
  64. "value": "Mobile",
  65. "id": "1001",
  66. "disabled": false
  67. },
  68. {
  69. "self": "https://url1003",
  70. "value": "ios",
  71. "id": "1003",
  72. "disabled": false
  73. }
  74. ]
  75. }
  76. }
  77. ]
  78. }

Below is my part of the code

  1. df2=pd.dataframe()
  2. d=pd.json_normalize(json.loads(df1['customfield_100'].to_json(orient='record')))
  3. filtered_component=[]
  4. for index in range(len(issues.id)):
  5. if((pd.json_normalize(df1['customfield_100'][index])).size>0):
  6. filtered_component.append(d[0]['value'])
  7. else:
  8. filtered_component.append('No Component')
  9. df2['Component']=filterd_component

When I list df2['Component'], I am getting the below output

  1. 'Mobile'
  2. 'Desktop'
  3. 'ios'
  4. 'Desktop'

I would like my output to be like (when I list df2[components]), i.e., if more than a single value for customfield_100, then I would like those to be separated by ";".

I am not sure how the loop/code should be written for it.

  1. 'Mobile'
  2. 'Desktop';'Mobile'
  3. 'ios'
  4. 'Desktop';'Mobile';'ios'
英文:

I have trouble in getting values from a Json and storing in a Dataframe.
My Json looks like

  1. {
  2. "issues": [
  3. {
  4. "expand": "operations",
  5. "id": "1",
  6. "fields": {
  7. "customfield_100": [
  8. {
  9. "self": "https://url1001",
  10. "value": "Mobile",
  11. "id": "1001",
  12. "disabled": "false"
  13. }
  14. ]
  15. }
  16. },
  17. {
  18. "expand": "operations",
  19. "id": "2",
  20. "fields": {
  21. "customfield_100": [
  22. {
  23. "self": "https://url1002",
  24. "value": "Desktop",
  25. "id": "1002",
  26. "disabled": false
  27. },
  28. {
  29. "self": "https://url1001",
  30. "value": "Mobile",
  31. "id": "1001",
  32. "disabled": false
  33. }
  34. ]
  35. }
  36. },
  37. {
  38. "expand": "operations",
  39. "id": "3",
  40. "fields": {
  41. "customfield_100": [
  42. {
  43. "self": "https://url1003",
  44. "value": "ios",
  45. "id": "1002",
  46. "disabled": false
  47. }
  48. ]
  49. }
  50. },
  51. {
  52. "expand": "operations",
  53. "id": "4",
  54. "fields": {
  55. "customfield_100": [
  56. {
  57. "self": "https://url1002",
  58. "value": "Desktop",
  59. "id": "1002",
  60. "disabled": false
  61. },
  62. {
  63. "self": "https://url1001",
  64. "value": "Mobile",
  65. "id": "1001",
  66. "disabled": false
  67. },
  68. {
  69. "self": "https://url1003",
  70. "value": "ios",
  71. "id": "1003",
  72. "disabled": false
  73. }
  74. ]
  75. }
  76. }
  77. ]
  78. }

Below is my part of the code

  1. df2=pd.dataframe()
  2. d=pd.json_normalize(json.loads(df1['customfield_100'].to_json(orient='record')))
  3. filtered_component=[]
  4. for index in range(len(issues.id)):
  5. if((pd.json_normalize(df1['customfield_100'][index])).size>0):
  6. filtered_component.append(d[0][index]['value']
  7. else:
  8. filtered_component.append('No Component')
  9. df2['Component']=filterd_component

When i list df2['Component'], i am getting the below output

  1. 'Mobile'
  2. 'Desktop'
  3. 'ios'
  4. 'Desktop'

I would like my output to be like(when i list df2[components]) i.e if more than single value for customfield_100 then i would like those to be separated by ;. I am not sure how the loop/code should be written for it

  1. 'Mobile'
  2. 'Desktop';'Mobile'
  3. 'ios'
  4. 'Desktop';'Mobile';'ios'

答案1

得分: 1

以下是翻译好的部分:

  1. Another possible solution:
  2. df = pd.json_normalize(
  3. data,
  4. record_path=['issues', 'fields', 'customfield_100'],
  5. meta=[['issues', 'id']])
  6. df.groupby('issues.id')['value'].agg(';'.join)
  7. where
  8. data = {
  9. "issues": [
  10. {
  11. "expand": "operations",
  12. "id": "1",
  13. "fields": {
  14. "customfield_100": [
  15. {
  16. "self": "https://url1001",
  17. "value": "Mobile",
  18. "id": "1001",
  19. "disabled": False
  20. }
  21. ]
  22. }
  23. },
  24. {
  25. "expand": "operations",
  26. "id": "2",
  27. "fields": {
  28. "customfield_100": [
  29. {
  30. "self": "https://url1002",
  31. "value": "Desktop",
  32. "id": "1002",
  33. "disabled": False
  34. },
  35. {
  36. "self": "https://url1001",
  37. "value": "Mobile",
  38. "id": "1001",
  39. "disabled": False
  40. }
  41. ]
  42. }
  43. },
  44. {
  45. "expand": "operations",
  46. "id": "3",
  47. "fields": {
  48. "customfield_100": [
  49. {
  50. "self": "https://url1003",
  51. "value": "ios",
  52. "id": "1002",
  53. "disabled": False
  54. }
  55. ]
  56. }
  57. },
  58. {
  59. "expand": "operations",
  60. "id": "4",
  61. "fields": {
  62. "customfield_100": [
  63. {
  64. "self": "https://url1002",
  65. "value": "Desktop",
  66. "id": "1002",
  67. "disabled": False
  68. },
  69. {
  70. "self": "https://url1001",
  71. "value": "Mobile",
  72. "id": "1001",
  73. "disabled": False
  74. },
  75. {
  76. "self": "https://url1003",
  77. "value": "ios",
  78. "id": "1003",
  79. "disabled": False
  80. }
  81. ]
  82. }
  83. }
  84. ]
  85. }
  86. Output:
  87. issues.id
  88. 1 Mobile
  89. 2 Desktop;Mobile
  90. 3 ios
  91. 4 Desktop;Mobile;ios
  92. Name: value, dtype: object

希望这对你有帮助。

英文:

Another possible solution:

  1. df = pd.json_normalize(
  2. data,
  3. record_path=['issues', 'fields', 'customfield_100'],
  4. meta=[['issues', 'id']])
  5. df.groupby('issues.id')['value'].agg(';'.join)

where

  1. data = {
  2. "issues": [
  3. {
  4. "expand": "operations",
  5. "id": "1",
  6. "fields": {
  7. "customfield_100": [
  8. {
  9. "self": "https://url1001",
  10. "value": "Mobile",
  11. "id": "1001",
  12. "disabled": False
  13. }
  14. ]
  15. }
  16. },
  17. {
  18. "expand": "operations",
  19. "id": "2",
  20. "fields": {
  21. "customfield_100": [
  22. {
  23. "self": "https://url1002",
  24. "value": "Desktop",
  25. "id": "1002",
  26. "disabled": False
  27. },
  28. {
  29. "self": "https://url1001",
  30. "value": "Mobile",
  31. "id": "1001",
  32. "disabled": False
  33. }
  34. ]
  35. }
  36. },
  37. {
  38. "expand": "operations",
  39. "id": "3",
  40. "fields": {
  41. "customfield_100": [
  42. {
  43. "self": "https://url1003",
  44. "value": "ios",
  45. "id": "1002",
  46. "disabled": False
  47. }
  48. ]
  49. }
  50. },
  51. {
  52. "expand": "operations",
  53. "id": "4",
  54. "fields": {
  55. "customfield_100": [
  56. {
  57. "self": "https://url1002",
  58. "value": "Desktop",
  59. "id": "1002",
  60. "disabled": False
  61. },
  62. {
  63. "self": "https://url1001",
  64. "value": "Mobile",
  65. "id": "1001",
  66. "disabled": False
  67. },
  68. {
  69. "self": "https://url1003",
  70. "value": "ios",
  71. "id": "1003",
  72. "disabled": False
  73. }
  74. ]
  75. }
  76. }
  77. ]
  78. }

Output:

  1. issues.id
  2. 1 Mobile
  3. 2 Desktop;Mobile
  4. 3 ios
  5. 4 Desktop;Mobile;ios
  6. Name: value, dtype: object

答案2

得分: 0

  1. 如果 `data` 包含了您解析的 Json 数据则可以执行以下操作
  2. ```py
  3. all_data = []
  4. for i in data['issues']:
  5. for k, v in i['fields'].items():
  6. for vv in v:
  7. all_data.append({'main_id': i['id'], 'field_id': k, **vv})
  8. df = pd.DataFrame(all_data)
  9. print(df)

这将打印:

  1. main_id field_id self value id disabled
  2. 0 1 customfield_100 https://url1001 Mobile 1001 false
  3. 1 2 customfield_100 https://url1002 Desktop 1002 False
  4. 2 2 customfield_100 https://url1001 Mobile 1001 False
  5. 3 3 customfield_100 https://url1003 ios 1002 False
  6. 4 4 customfield_100 https://url1002 Desktop 1002 False
  7. 5 4 customfield_100 https://url1001 Mobile 1001 False
  8. 6 4 customfield_100 https://url1003 ios 1003 False

然后您可以按 main_id 进行分组,例如:

  1. df = df.groupby('main_id')['value'].agg(';'.join)
  2. print(df)

这将打印:

  1. main_id
  2. 1 Mobile
  3. 2 Desktop;Mobile
  4. 3 ios
  5. 4 Desktop;Mobile;ios
  6. Name: value, dtype: object
  1. <details>
  2. <summary>英文:</summary>
  3. If `data` contains your parsed Json data then you can do:
  4. ```py
  5. all_data = []
  6. for i in data[&#39;issues&#39;]:
  7. for k, v in i[&#39;fields&#39;].items():
  8. for vv in v:
  9. all_data.append({&#39;main_id&#39;: i[&#39;id&#39;], &#39;field_id&#39;: k, **vv})
  10. df = pd.DataFrame(all_data)
  11. print(df)

This prints:

  1. main_id field_id self value id disabled
  2. 0 1 customfield_100 https://url1001 Mobile 1001 false
  3. 1 2 customfield_100 https://url1002 Desktop 1002 False
  4. 2 2 customfield_100 https://url1001 Mobile 1001 False
  5. 3 3 customfield_100 https://url1003 ios 1002 False
  6. 4 4 customfield_100 https://url1002 Desktop 1002 False
  7. 5 4 customfield_100 https://url1001 Mobile 1001 False
  8. 6 4 customfield_100 https://url1003 ios 1003 False

Then you can group by main_id, e.g.:

  1. df = df.groupby(&#39;main_id&#39;)[&#39;value&#39;].agg(&#39;;&#39;.join)
  2. print(df)

This prints:

  1. main_id
  2. 1 Mobile
  3. 2 Desktop;Mobile
  4. 3 ios
  5. 4 Desktop;Mobile;ios
  6. Name: value, dtype: object

huangapple
  • 本文由 发表于 2023年6月19日 23:37:59
  • 转载请务必保留本文链接:https://go.coder-hub.com/76508142.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定