Unnest Json Array json_extract_scalar

huangapple go评论127阅读模式
英文:

Unnest Json Array json_extract_scalar

问题

我有一个表格 employee,其中有一个列 employee_detail,其中的数据如下:

  1. {'emp': 'myself', 'dept': 'IT', 'empid': '12345', 'empdetail': {'JoiningDetail': {'JoinedGeneratedTimestamp': 1683295496534, 'position': 12115}, 'xyz': None, 'abc': None}, 'title': 'SrDevelop'}

我想在从 Athena 查询时获取 JoinedGeneratedTimestamp 和 position 数据。

以下是我尝试的查询:

  1. with data as(
  2. select employee_detail,
  3. json_extract_scalar((replace(employee_detail, ''', '"')),
  4. '$.empdetail.JoiningDetail.JoinedGeneratedTimestamp') as JoinedGeneratedTimestamp
  5. from "employee" limit 5 )
  6. select employee_detail, JoinedGeneratedTimestamp
  7. from data

在 JoinedGeneratedTimestamp 字段中获取到空白。

英文:

I have a table employee , column employee_detail which have data like:

  1. {'emp': 'myself', 'dept': 'IT', 'empid': '12345', 'empdetail': {'JoiningDetail': {'JoinedGeneratedTimestamp': 1683295496534, 'position': 12115}, 'xyz': None, 'abc': None}, 'title': 'SrDevelop'}

where I want to get JoinedGeneratedTimestamp and postition data while querying from athena

Below query I tried :

  1. with data as(
  2. select employee_detail,
  3. json_extract_scalar((replace(employee_detail, '''', '"')),
  4. '$.empdetail.JoiningDetail.JoinedGeneratedTimestamp') as JoinedGeneratedTimestamp
  5. from "employee" limit 5 )
  6. select employee_detail, JoinedGeneratedTimestamp
  7. from data

Getting blank for JoinedGeneratedTimestamp field.

答案1

得分: 0

json_extract_scalar -

  1. "json_extract_scalar" -
  2. ```JSON
  3. {
  4. "join_generated_timestamp": 1683295496534,
  5. "position": 12115
  6. }

json_extract -

  1. "json_extract" -
  2. ```JSON
  3. {
  4. "join_generated_timestamp": 1683295496534,
  5. "position": 12115
  6. }
英文:

Are you sure the "json" like field is in that format? also, maybe the JoinedGeneratedTimestamp sometimes holds non- scalar values?

the json_exctract_scalar can only extract boolean, number and string
if you wish to extract arrays, maps or structs (more complex types) use the regular json_extract

For both cases, based only on the data example you provided, it seems to work just fine

json_extract_scalar -

  1. WITH dataset AS (
  2. SELECT '{''emp'': ''myself'', ''dept'': ''IT'', ''empid'': ''12345'', ''empdetail'': {''JoiningDetail'': {''JoinedGeneratedTimestamp'': 1683295496534, ''position'': 12115}, ''xyz'': None, ''abc'': None}, ''title'': ''SrDevelop''}'
  3. AS employee_detail
  4. )
  5. select
  6. json_extract_scalar(replace(employee_detail, '''', '"'), '$.empdetail.JoiningDetail.JoinedGeneratedTimestamp') as join_generated_timestamp,
  7. json_extract_scalar(replace(employee_detail, '''', '"'), '$.empdetail.JoiningDetail.position') as position
  8. from dataset

output -

  1. join_generated_timestamp, position
  2. 1683295496534, 12115

json_extract -

  1. WITH dataset AS (
  2. SELECT '{''emp'': ''myself'', ''dept'': ''IT'', ''empid'': ''12345'', ''empdetail'': {''JoiningDetail'': {''JoinedGeneratedTimestamp'': 1683295496534, ''position'': 12115}, ''xyz'': None, ''abc'': None}, ''title'': ''SrDevelop''}'
  3. AS employee_detail
  4. )
  5. select
  6. json_extract(replace(employee_detail, '''', '"'), '$.empdetail.JoiningDetail.JoinedGeneratedTimestamp') as join_generated_timestamp,
  7. json_extract(replace(employee_detail, '''', '"'), '$.empdetail.JoiningDetail.position') as position
  8. from dataset

output -

  1. join_generated_timestamp, position
  2. 1683295496534, 12115

huangapple
  • 本文由 发表于 2023年6月22日 15:18:14
  • 转载请务必保留本文链接:https://go.coder-hub.com/76529416.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定