Creating a new column by parsing a JSON column

Question

I have a DataFrame with one JSON column (Vals):

  Identity                              Vals
  2fc9d38d-0fe4-c7be                    {"$id":"2","Address":"22.44","Location":{"Code":"TN"},"Asset":false,"Roles":["A"],"Type":"webfile"}
  abd77d57ac29                          {"$id":"3","Address":"40.1","Location":{"Code":"SS"},"Asset":false,"Roles":["Attacker"],"Type":"webfile"}
  c7be-4a37                             {"$id":"4","AppId":11161,"SaasId":11161,"Name":"Office 365","InstanceId":0,"Type":"app"}
  916a-8051-8fd1721385ae                {"$id":"3","Address":"213.85","Asset":false,"Roles":["tm"],"Type":"webfile"}
  8051-8fd1721385ae                     {"$id":"4","Address":"198.137","Asset":false,"Roles":["Contextual"],"Type":"webfile"}
  8fd1721385ae                          {"$id":"5","AppId":26324,"sId":26324,"Name":"MB","InstanceId":0,"Type":"app"}
  58a51721385ae                         {"$id":"6","Address":".225.0","Asset":false,"Roles":["Contextual"],"Type":"webfile"}
  964fb17e-a352-dbd4-d5b7-374172d811aa  {"$id":"2","Name":"AD561-SA","DisplayName":"AD561-SA","Type":"account"}

I want to create a new column that holds the "Type" value of each row ("webfile", "app", etc.). I ran the code below:

  df_test["new_col"]=df_test['Vals'].apply(lambda x: x['Type'] if 'Type' in x else None)

But I get this error:

  TypeError: list indices must be integers or slices, not str

Can someone help?

Answer 1

Score: 3

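Because your Vals column contains JSON strings, you have to decode them before extracting the Type field:

  import json
  import pandas as pd

  # Decode each JSON string into a dict, flatten the dicts into columns, then keep "Type"
  df_test['Type'] = pd.json_normalize(df_test['Vals'].apply(json.loads))['Type']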

Output:

  >>> df_test[['Identity', 'Type']]
                                 Identity     Type
  0                    2fc9d38d-0fe4-c7be  webfile
  1                          abd77d57ac29  webfile
  2                             c7be-4a37      app
  3                916a-8051-8fd1721385ae  webfile
  4                     8051-8fd1721385ae  webfile
  5                          8fd1721385ae      app
  6                         58a51721385ae  webfile
  7  964fb17e-a352-dbd4-d5b7-374172d811aa  account
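
As a variation on the approach above, here is a minimal sketch that decodes each value with json.loads inside apply and reads the key with dict.get. It is meant for the case where some rows might lack a "Type" key, hold invalid JSON, or decode to something other than an object. Because apply returns a Series with the same index as the input, the result lines up with df_test whatever its index is. The sample rows, the custom index and the helper name extract_type are assumptions made for illustration; they are not from the original post.

```python
import json

import pandas as pd

# Hypothetical sample data modelled on the question's rows (values shortened).
df_test = pd.DataFrame(
    {
        "Identity": ["2fc9d38d-0fe4-c7be", "c7be-4a37", "964fb17e-a352"],
        "Vals": [
            '{"$id":"2","Address":"22.44","Roles":["A"],"Type":"webfile"}',
            '{"$id":"4","AppId":11161,"Name":"Office 365","Type":"app"}',
            '{"$id":"2","Name":"AD561-SA"}',  # no "Type" key on purpose
        ],
    },
    index=[10, 20, 30],  # non-default index, to show that rows stay aligned
)


def extract_type(raw):
    """Return the "Type" field of one JSON string, or None if unavailable."""
    try:
        parsed = json.loads(raw)
    except (TypeError, ValueError):
        # TypeError: the value is not a string; ValueError: it is not valid JSON.
        return None
    # Only JSON objects (dicts) have named fields; arrays and scalars give None.
    return parsed.get("Type") if isinstance(parsed, dict) else None


# apply() keeps the index of df_test["Vals"], so the assignment aligns row by row.
df_test["Type"] = df_test["Vals"].apply(extract_type)
print(df_test[["Identity", "Type"]])
```

Rows that cannot be parsed end up with a missing value instead of raising, which may or may not be what you want; drop the try/except if bad input should fail loudly.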

huangapple
  • Posted on March 7, 2023, 22:36:03
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75663339.html