2023年6月19日 23:37:59go评论103阅读模式

英文:

Appending multiple values of a json into a Single Pandas column using python

问题

I have trouble in getting values from a Json and storing in a Dataframe.

My Json looks like

{
  "issues": [
    {
      "expand": "operations",
      "id": "1",
      "fields": {
        "customfield_100": [
          {
            "self": "https://url1001",
            "value": "Mobile",
            "id": "1001",
            "disabled": "false"
          }
        ]
      }
    },
    {
      "expand": "operations",
      "id": "2",
      "fields": {
        "customfield_100": [
          {
            "self": "https://url1002",
            "value": "Desktop",
            "id": "1002",
            "disabled": false
          },
          {
            "self": "https://url1001",
            "value": "Mobile",
            "id": "1001",
            "disabled": false
          }
        ]
      }
    },
    {
      "expand": "operations",
      "id": "3",
      "fields": {
        "customfield_100": [
          {
            "self": "https://url1003",
            "value": "ios",
            "id": "1002",
            "disabled": false
          }
        ]
      }
    },
    {
      "expand": "operations",
      "id": "4",
      "fields": {
        "customfield_100": [
          {
            "self": "https://url1002",
            "value": "Desktop",
            "id": "1002",
            "disabled": false
          },
          {
            "self": "https://url1001",
            "value": "Mobile",
            "id": "1001",
            "disabled": false
          },
          {
            "self": "https://url1003",
            "value": "ios",
            "id": "1003",
            "disabled": false
          }
        ]
      }
    }
  ]
}

Below is my part of the code

df2=pd.dataframe()
d=pd.json_normalize(json.loads(df1['customfield_100'].to_json(orient='record')))
filtered_component=[]
for index in range(len(issues.id)):
     if((pd.json_normalize(df1['customfield_100'][index])).size>0):
         filtered_component.append(d[0]['value'])
     else:
         filtered_component.append('No Component')
df2['Component']=filterd_component

When I list df2['Component'], I am getting the below output

'Mobile'
'Desktop'
'ios'
'Desktop'

I would like my output to be like (when I list df2[components]), i.e., if more than a single value for customfield_100, then I would like those to be separated by ";".

I am not sure how the loop/code should be written for it.

'Mobile'
'Desktop';'Mobile'
'ios'
'Desktop';'Mobile';'ios'

英文:

I have trouble in getting values from a Json and storing in a Dataframe.
My Json looks like

{
&quot;issues&quot;: [
{
&quot;expand&quot;: &quot;operations&quot;,
&quot;id&quot;: &quot;1&quot;,
&quot;fields&quot;: {
&quot;customfield_100&quot;: [
{
&quot;self&quot;: &quot;https://url1001&quot;,
&quot;value&quot;: &quot;Mobile&quot;,
&quot;id&quot;: &quot;1001&quot;,
&quot;disabled&quot;: &quot;false&quot;
}
]
}
},
{
&quot;expand&quot;: &quot;operations&quot;,
&quot;id&quot;: &quot;2&quot;,
&quot;fields&quot;: {
&quot;customfield_100&quot;: [
{
&quot;self&quot;: &quot;https://url1002&quot;,
&quot;value&quot;: &quot;Desktop&quot;,
&quot;id&quot;: &quot;1002&quot;,
&quot;disabled&quot;: false
},
{
&quot;self&quot;: &quot;https://url1001&quot;,
&quot;value&quot;: &quot;Mobile&quot;,
&quot;id&quot;: &quot;1001&quot;,
&quot;disabled&quot;: false
}
]
}
},
{
&quot;expand&quot;: &quot;operations&quot;,
&quot;id&quot;: &quot;3&quot;,
&quot;fields&quot;: {
&quot;customfield_100&quot;: [
{
&quot;self&quot;: &quot;https://url1003&quot;,
&quot;value&quot;: &quot;ios&quot;,
&quot;id&quot;: &quot;1002&quot;,
&quot;disabled&quot;: false
}
]
}
},
{
&quot;expand&quot;: &quot;operations&quot;,
&quot;id&quot;: &quot;4&quot;,
&quot;fields&quot;: {
&quot;customfield_100&quot;: [
{
&quot;self&quot;: &quot;https://url1002&quot;,
&quot;value&quot;: &quot;Desktop&quot;,
&quot;id&quot;: &quot;1002&quot;,
&quot;disabled&quot;: false
},
{
&quot;self&quot;: &quot;https://url1001&quot;,
&quot;value&quot;: &quot;Mobile&quot;,
&quot;id&quot;: &quot;1001&quot;,
&quot;disabled&quot;: false
},
{
&quot;self&quot;: &quot;https://url1003&quot;,
&quot;value&quot;: &quot;ios&quot;,
&quot;id&quot;: &quot;1003&quot;,
&quot;disabled&quot;: false
}
]
}
}
]
}

Below is my part of the code

        df2=pd.dataframe()
d=pd.json_normalize(json.loads(df1[&#39;customfield_100&#39;].to_json(orient=&#39;record&#39;)))
filtered_component=[]
for index in range(len(issues.id)):
if((pd.json_normalize(df1[&#39;customfield_100&#39;][index])).size&gt;0):
filtered_component.append(d[0][index][&#39;value&#39;]
else:
filtered_component.append(&#39;No Component&#39;)
df2[&#39;Component&#39;]=filterd_component

When i list df2['Component'], i am getting the below output

&#39;Mobile&#39;
&#39;Desktop&#39;
&#39;ios&#39;
&#39;Desktop&#39;

I would like my output to be like(when i list df2[components]) i.e if more than single value for customfield_100 then i would like those to be separated by ;. I am not sure how the loop/code should be written for it

&#39;Mobile&#39;
&#39;Desktop&#39;;&#39;Mobile&#39;
&#39;ios&#39;
&#39;Desktop&#39;;&#39;Mobile&#39;;&#39;ios&#39;

答案1

得分: 1

以下是翻译好的部分：

Another possible solution:
df = pd.json_normalize(
data, 
record_path=['issues', 'fields', 'customfield_100'], 
meta=[['issues', 'id']])
df.groupby('issues.id')['value'].agg(';'.join)
where
data = {
"issues": [
{
"expand": "operations",
"id": "1",
"fields": {
"customfield_100": [
{
"self": "https://url1001",
"value": "Mobile",
"id": "1001",
"disabled": False
}
]
}
},
{
"expand": "operations",
"id": "2",
"fields": {
"customfield_100": [
{
"self": "https://url1002",
"value": "Desktop",
"id": "1002",
"disabled": False
},
{
"self": "https://url1001",
"value": "Mobile",
"id": "1001",
"disabled": False
}
]
}
},
{
"expand": "operations",
"id": "3",
"fields": {
"customfield_100": [
{
"self": "https://url1003",
"value": "ios",
"id": "1002",
"disabled": False
}
]
}
},
{
"expand": "operations",
"id": "4",
"fields": {
"customfield_100": [
{
"self": "https://url1002",
"value": "Desktop",
"id": "1002",
"disabled": False
},
{
"self": "https://url1001",
"value": "Mobile",
"id": "1001",
"disabled": False
},
{
"self": "https://url1003",
"value": "ios",
"id": "1003",
"disabled": False
}
]
}
}
]
}
Output:
issues.id
1                Mobile
2        Desktop;Mobile
3                   ios
4    Desktop;Mobile;ios
Name: value, dtype: object

希望这对你有帮助。

英文:

Another possible solution:

df = pd.json_normalize(
data, 
record_path=[&#39;issues&#39;, &#39;fields&#39;, &#39;customfield_100&#39;], 
meta=[[&#39;issues&#39;, &#39;id&#39;]])
df.groupby(&#39;issues.id&#39;)[&#39;value&#39;].agg(&#39;;&#39;.join)

where

data = {
&quot;issues&quot;: [
{
&quot;expand&quot;: &quot;operations&quot;,
&quot;id&quot;: &quot;1&quot;,
&quot;fields&quot;: {
&quot;customfield_100&quot;: [
{
&quot;self&quot;: &quot;https://url1001&quot;,
&quot;value&quot;: &quot;Mobile&quot;,
&quot;id&quot;: &quot;1001&quot;,
&quot;disabled&quot;: False
}
]
}
},
{
&quot;expand&quot;: &quot;operations&quot;,
&quot;id&quot;: &quot;2&quot;,
&quot;fields&quot;: {
&quot;customfield_100&quot;: [
{
&quot;self&quot;: &quot;https://url1002&quot;,
&quot;value&quot;: &quot;Desktop&quot;,
&quot;id&quot;: &quot;1002&quot;,
&quot;disabled&quot;: False
},
{
&quot;self&quot;: &quot;https://url1001&quot;,
&quot;value&quot;: &quot;Mobile&quot;,
&quot;id&quot;: &quot;1001&quot;,
&quot;disabled&quot;: False
}
]
}
},
{
&quot;expand&quot;: &quot;operations&quot;,
&quot;id&quot;: &quot;3&quot;,
&quot;fields&quot;: {
&quot;customfield_100&quot;: [
{
&quot;self&quot;: &quot;https://url1003&quot;,
&quot;value&quot;: &quot;ios&quot;,
&quot;id&quot;: &quot;1002&quot;,
&quot;disabled&quot;: False
}
]
}
},
{
&quot;expand&quot;: &quot;operations&quot;,
&quot;id&quot;: &quot;4&quot;,
&quot;fields&quot;: {
&quot;customfield_100&quot;: [
{
&quot;self&quot;: &quot;https://url1002&quot;,
&quot;value&quot;: &quot;Desktop&quot;,
&quot;id&quot;: &quot;1002&quot;,
&quot;disabled&quot;: False
},
{
&quot;self&quot;: &quot;https://url1001&quot;,
&quot;value&quot;: &quot;Mobile&quot;,
&quot;id&quot;: &quot;1001&quot;,
&quot;disabled&quot;: False
},
{
&quot;self&quot;: &quot;https://url1003&quot;,
&quot;value&quot;: &quot;ios&quot;,
&quot;id&quot;: &quot;1003&quot;,
&quot;disabled&quot;: False
}
]
}
}
]
}

Output:

issues.id
1                Mobile
2        Desktop;Mobile
3                   ios
4    Desktop;Mobile;ios
Name: value, dtype: object

答案2

得分: 0

如果 `data` 包含了您解析的 Json 数据，则可以执行以下操作：
```py
all_data = []
for i in data['issues']:
    for k, v in i['fields'].items():
        for vv in v:
            all_data.append({'main_id': i['id'], 'field_id': k, **vv})
df = pd.DataFrame(all_data)
print(df)

这将打印：

  main_id         field_id             self    value    id disabled
0       1  customfield_100  https://url1001   Mobile  1001    false
1       2  customfield_100  https://url1002  Desktop  1002    False
2       2  customfield_100  https://url1001   Mobile  1001    False
3       3  customfield_100  https://url1003      ios  1002    False
4       4  customfield_100  https://url1002  Desktop  1002    False
5       4  customfield_100  https://url1001   Mobile  1001    False
6       4  customfield_100  https://url1003      ios  1003    False

然后您可以按 main_id 进行分组，例如：

df = df.groupby('main_id')['value'].agg(';'.join)
print(df)

这将打印：

main_id
1                Mobile
2        Desktop;Mobile
3                   ios
4    Desktop;Mobile;ios
Name: value, dtype: object


<details>
<summary>英文:</summary>
If `data` contains your parsed Json data then you can do:
```py
all_data = []
for i in data[&#39;issues&#39;]:
for k, v in i[&#39;fields&#39;].items():
for vv in v:
all_data.append({&#39;main_id&#39;: i[&#39;id&#39;], &#39;field_id&#39;: k, **vv})
df = pd.DataFrame(all_data)
print(df)

This prints:

  main_id         field_id             self    value    id disabled
0       1  customfield_100  https://url1001   Mobile  1001    false
1       2  customfield_100  https://url1002  Desktop  1002    False
2       2  customfield_100  https://url1001   Mobile  1001    False
3       3  customfield_100  https://url1003      ios  1002    False
4       4  customfield_100  https://url1002  Desktop  1002    False
5       4  customfield_100  https://url1001   Mobile  1001    False
6       4  customfield_100  https://url1003      ios  1003    False

Then you can group by main_id, e.g.:

df = df.groupby(&#39;main_id&#39;)[&#39;value&#39;].agg(&#39;;&#39;.join)
print(df)

This prints:

main_id
1                Mobile
2        Desktop;Mobile
3                   ios
4    Desktop;Mobile;ios
Name: value, dtype: object

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

将多个 JSON 值附加到单个 Pandas 列中使用 Python

问题

答案1

答案2

Pandas按季度和公司统计员工人数

Python在循环中增加参数数量

PyMySQL – 如何在’INSERT’语句中使用通配符？

Python: for循环无原因停止

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。