Creating a new column by parsing a JSON column

Question


I have a DataFrame with one JSON column (Vals):

                            Identity                                                                                                      Vals
                  2fc9d38d-0fe4-c7be       {"$id":"2","Address":"22.44","Location":{"Code":"TN"},"Asset":false,"Roles":["A"],"Type":"webfile"}
                        abd77d57ac29 {"$id":"3","Address":"40.1","Location":{"Code":"SS"},"Asset":false,"Roles":["Attacker"],"Type":"webfile"}
                           c7be-4a37                  {"$id":"4","AppId":11161,"SaasId":11161,"Name":"Office 365","InstanceId":0,"Type":"app"}
              916a-8051-8fd1721385ae                              {"$id":"3","Address":"213.85","Asset":false,"Roles":["tm"],"Type":"webfile"}
                   8051-8fd1721385ae                     {"$id":"4","Address":"198.137","Asset":false,"Roles":["Contextual"],"Type":"webfile"}
                        8fd1721385ae                             {"$id":"5","AppId":26324,"sId":26324,"Name":"MB","InstanceId":0,"Type":"app"}
                       58a51721385ae                      {"$id":"6","Address":".225.0","Asset":false,"Roles":["Contextual"],"Type":"webfile"}
964fb17e-a352-dbd4-d5b7-374172d811aa                                   {"$id":"2","Name":"AD561-SA","DisplayName":"AD561-SA","Type":"account"}

I want to create a new column that holds the "Type" value from each row ("webfile", "app", etc.). I ran the code below:

df_test["new_col"]=df_test['Vals'].apply(lambda x: x['Type'] if 'Type' in x else None)

But I get this error:

TypeError: list indices must be integers or slices, not str

Can someone help?

Answer 1

Score: 3


Because your Vals column contains JSON strings, you have to decode them before you can extract the Type field:

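import json
import pandas as pd

# Decode each JSON string into a dict, flatten the dicts, then keep the "Type" column
df_test['Type'] = pd.json_normalize(df_test['Vals'].apply(json.loads))['Type']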

Output:

>>> df_test[['Identity', 'Type']]
                               Identity     Type
0                    2fc9d38d-0fe4-c7be  webfile
1                          abd77d57ac29  webfile
2                             c7be-4a37      app
3                916a-8051-8fd1721385ae  webfile
4                     8051-8fd1721385ae  webfile
5                          8fd1721385ae      app
6                         58a51721385ae  webfile
7  964fb17e-a352-dbd4-d5b7-374172d811aa  account
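
If some rows are not plain JSON objects, a more defensive variant may help. The reported TypeError ("list indices must be integers or slices, not str") suggests that at least one cell holds a list rather than a string or dict, although the sample shown does not confirm this. The sketch below is only an illustration under that assumption: the sample rows and the helper name extract_type are hypothetical, and taking the first element of a list is a guess about what such rows would mean.

```python
import json

import pandas as pd

# Hypothetical sample data. The second row stores a JSON array, which is one
# way to end up indexing a list with a string key.
df_test = pd.DataFrame({
    "Identity": ["2fc9d38d-0fe4-c7be", "c7be-4a37", "8fd1721385ae", "no-type"],
    "Vals": [
        '{"$id":"2","Address":"22.44","Asset":false,"Roles":["A"],"Type":"webfile"}',
        '[{"$id":"4","AppId":11161,"Name":"Office 365","Type":"app"}]',
        '{"$id":"2","Name":"AD561-SA","Type":"account"}',
        '{"$id":"9","Name":"untyped"}',
    ],
})


def extract_type(value):
    """Return the "Type" field of a JSON value, or None if it cannot be found."""
    # Decode only when the cell still holds text; parsed objects pass through.
    if isinstance(value, str):
        try:
            value = json.loads(value)
        except json.JSONDecodeError:
            return None
    # For a JSON array, look at its first element (an assumption about the data).
    if isinstance(value, list):
        value = value[0] if value else None
    # Only a dict can carry a "Type" key; .get returns None when it is absent.
    if isinstance(value, dict):
        return value.get("Type")
    return None


df_test["Type"] = df_test["Vals"].apply(extract_type)
print(df_test[["Identity", "Type"]])
```

Because the result comes from apply on the original column, it keeps the DataFrame's own index, so the new column lines up with its rows even when the index is not the default 0..n-1.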

huangapple
  • Published on March 7, 2023, 22:36:03
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75663339.html