Why doesn't get_json_object() work to extract a value from JSON stored in a Hive SQL table?

Question


In my Hive table T, the field A is a string in JSON format, and A stores the value {"c_e_i":"{\"e_c_f\":1}"}.

I want to get c_e_i.\"e_c_f\".

So I used get_json_object(T.A, '$.c_e_i.\\"e_c_f\\"'), but it doesn't work. What should I do?

Answer 1

Score: 1


Simply because the value "{\"e_c_f\":1}" is a string, not a JSON map. The whole document is not of type Map<String:Map<String:String>>; it is Map<String:String>, so a path cannot descend below c_e_i in a single call.

A Map<String:Map<String:String>> would look like this: {"c_e_i":{"e_c_f":1}}

You can extract the string value like this:

with T as (
select '{"c_e_i":"{\\\"e_c_f\\\":1}"}' as A 
)

select A as original_str, 
get_json_object(A, '$.c_e_i') as result
from T

Result:

original_str                  result
{"c_e_i":"{\"e_c_f\":1}"}     {"e_c_f":1}

As you can see, the result is a proper JSON map. The backslashes are gone as well, because the escape sequences are interpreted (and so removed) during extraction. You can therefore extract e_c_f from it by applying get_json_object twice:

with T as (
select '{"c_e_i":"{\\\"e_c_f\\\":1}"}' as A 
)

select A as original_str, 
get_json_object(get_json_object(A, '$.c_e_i'),'$.e_c_f') as result
from T

Result:

original_str                  result
{"c_e_i":"{\"e_c_f\":1}"}     1
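As another possible route, Hive's built-in json_tuple UDTF can be chained through two lateral views to perform the same two-step parse. The following is a minimal sketch, assuming the same sample row; the aliases lvl1 and lvl2 are made up for illustration, and the query has not been checked against any particular Hive version:

```sql
-- Sketch only: lvl1 and lvl2 are hypothetical alias names.
with T as (
select '{"c_e_i":"{\\\"e_c_f\\\":1}"}' as A
)

select T.A as original_str,
       lvl2.e_c_f as result
from T
-- first pass: pull the string stored under c_e_i out of the outer JSON
lateral view json_tuple(T.A, 'c_e_i') lvl1 as c_e_i
-- second pass: parse that string as JSON and read e_c_f from it
lateral view json_tuple(lvl1.c_e_i, 'e_c_f') lvl2 as e_c_f
```

For the sample row this should give the same value as the nested get_json_object call. Whether it reads better is a matter of taste, but it may be convenient when several keys are needed from each level.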

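Alternatively, strip the extra characters from the original JSON to turn it into a proper Map<String:Map<String:String>>, and then extract with a single path. This relies on pattern matching, so it only suits data shaped like the sample: the last replacement drops any double quote that directly precedes a }, which would also damage an ordinary string value at the end of an object.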

with T as (
select '{"c_e_i":"{\\\"e_c_f\\\":1}"}' as A 
)

select A as original_str, 
-- remove the backslashes, then the double quote before { and the one after the inner }
regexp_replace(regexp_replace(regexp_replace(A, '\\\\"','"'), '"\\{','\\{'),'"\\}','\\}') as correct_JSON,
get_json_object(regexp_replace(regexp_replace(regexp_replace(A, '\\\\"','"'), '"\\{','\\{'),'"\\}','\\}'), '$.c_e_i.e_c_f') as result
from T

Result:

original_str                  correct_json              result
{"c_e_i":"{\"e_c_f\":1}"}     {"c_e_i":{"e_c_f":1}}     1
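Since the query above repeats the three nested regexp_replace calls, one option is to compute the cleaned string once in a second CTE and reuse it. This is a sketch under the same assumptions as the sample row, and the CTE name fixed is hypothetical:

```sql
-- Sketch only: "fixed" is a hypothetical CTE name.
with T as (
select '{"c_e_i":"{\\\"e_c_f\\\":1}"}' as A
),
fixed as (
select A,
       -- same three replacements as above, written once
       regexp_replace(regexp_replace(regexp_replace(A, '\\\\"','"'), '"\\{','\\{'),'"\\}','\\}') as correct_json
from T
)

select A as original_str,
       correct_json,
       get_json_object(correct_json, '$.c_e_i.e_c_f') as result
from fixed
```

The patterns are unchanged, so the caveats of the regex approach still apply. The only difference is that the cleanup expression lives in one place.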

huangapple
  • This article was published on May 25, 2023, 21:28:13
  • When reposting, please be sure to keep the link to this article: https://go.coder-hub.com/76332838.html