Why doesn't get_json_object() work to extract a value from JSON stored in a Hive SQL table?
Question
In my Hive table T, column A is a string that holds JSON. A stores the value {"c_e_i":"{\"e_c_f\":1}"}.
I want to get the e_c_f value that sits under c_e_i, so I used get_json_object(T.A, '$.c_e_i.\\"e_c_f\\"'), but it doesn't work. What should I do?
Answer 1
Score: 1
This happens because the value "{\"e_c_f\":1}" is a string, not a JSON map. The whole document is of type Map<String:String>, not Map<String:Map<String:String>>, so a single path cannot step from c_e_i into e_c_f.
A Map<String:Map<String:String>> would look like this: {"c_e_i":{"e_c_f":1}}
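The string-versus-map distinction can be checked outside Hive. The following is a minimal Python sketch, not Hive code; the variable names are made up for illustration, and it only assumes the two JSON documents quoted above.

```python
import json

# The value as stored in column A: the inner object is wrapped in quotes.
stored = r'{"c_e_i":"{\"e_c_f\":1}"}'
# The properly nested form, for comparison.
nested = '{"c_e_i":{"e_c_f":1}}'

stored_value = json.loads(stored)["c_e_i"]
nested_value = json.loads(nested)["c_e_i"]

# The stored value parses to a plain string, so it has no keys to look up.
print(type(stored_value).__name__, repr(stored_value))
# The nested value parses to a dict, so "e_c_f" is directly reachable.
print(type(nested_value).__name__, nested_value)
```

Because the stored value is only a string at this point, it needs a second parse before e_c_f becomes addressable.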
And you can extract the string value like this:
with T as (
select '{"c_e_i":"{\\\"e_c_f\\\":1}"}' as A
)
select A as original_str,
get_json_object(A, '$.c_e_i') as result
from T
Result:
original_str                   result
{"c_e_i":"{\"e_c_f\":1}"}      {"e_c_f":1}
As you can see, the result is a valid JSON map. The backslashes are gone as well, because the escape sequences are interpreted during extraction. You can therefore extract e_c_f from it by applying get_json_object twice:
with T as (
select '{"c_e_i":"{\\\"e_c_f\\\":1}"}' as A
)
select A as original_str,
get_json_object(get_json_object(A, '$.c_e_i'),'$.e_c_f') as result
from T
Result:
original_str                   result
{"c_e_i":"{\"e_c_f\":1}"}      1
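The nested get_json_object call is a two-step decode: parse the outer document, then parse the string it returns. As a rough Python analogue (the helper name get_nested is hypothetical and is not part of Hive):

```python
import json

def get_nested(raw, outer_key, inner_key):
    """Decode a JSON document whose inner object is stored as a string."""
    outer = json.loads(raw)          # first pass: outer map
    inner_text = outer[outer_key]    # still a str at this point
    inner = json.loads(inner_text)   # second pass: the embedded object
    return inner[inner_key]

raw = r'{"c_e_i":"{\"e_c_f\":1}"}'
value = get_nested(raw, "c_e_i", "e_c_f")
print(value)
```

Each json.loads call plays the role of one get_json_object call in the query above.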
Alternatively, strip the extra characters from the original JSON so that it becomes a proper Map<String:Map<String:String>>, then extract:
with T as (
select '{"c_e_i":"{\\\"e_c_f\\\":1}"}' as A
)
select A as original_str,
regexp_replace(regexp_replace(regexp_replace(A, '\\\\"','"'), '"\\{','\\{'),'"\\}','\\}') as correct_JSON, --strip the backslashes, then drop the double quotes that wrap the inner object
get_json_object(regexp_replace(regexp_replace(regexp_replace(A, '\\\\"','"'), '"\\{','\\{'),'"\\}','\\}'), '$.c_e_i.e_c_f') as result
from T
Result:
original_str                   correct_json               result
{"c_e_i":"{\"e_c_f\":1}"}      {"c_e_i":{"e_c_f":1}}      1
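One caveat with the replace-based cleanup is that it works on text patterns, not on JSON structure. The Python sketch below mirrors the three replacements with plain string replaces (the function name and the second sample row are hypothetical) and suggests that the approach suits the sample value but may break when the inner object ends with a string value.

```python
import json

def unwrap_inner_json(raw):
    # Mirrors the three nested regexp_replace calls with plain string replaces.
    text = raw.replace('\\"', '"')   # drop the backslash in front of each quote
    text = text.replace('"{', '{')   # drop the quote that opens the inner object
    text = text.replace('"}', '}')   # drop any quote that sits right before a }
    return text

simple = r'{"c_e_i":"{\"e_c_f\":1}"}'
fixed = unwrap_inner_json(simple)
print(fixed)
print(json.loads(fixed)["c_e_i"]["e_c_f"])

# Hypothetical row whose inner object ends with a string value.
tricky = r'{"c_e_i":"{\"e_c_f\":\"x\"}"}'
try:
    json.loads(unwrap_inner_json(tricky))
    tricky_ok = True
except json.JSONDecodeError:
    # The last replace also removed the closing quote of "x".
    tricky_ok = False
print(tricky_ok)
```

If the inner JSON can contain string values, applying get_json_object twice is likely the safer option, since it never rewrites the text.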