BigQuery:使用REGEX计算所有列中的非空值数量。

huangapple go评论88阅读模式
英文:

BigQuery: Count non-null values across all columns with REGEX

问题

以下是翻译好的代码部分:

SELECT col_name, COUNT(1) non_nulls_count
FROM table t,
UNNEST(REGEXP_EXTRACT_ALL(TO_JSON_STRING(t), r'"(\w+)":(?!null)')) col_name
GROUP BY col_name;

请注意,这段代码用于计算每个列中的非空值数量。

英文:

I have the following query that helps me count how many null values were reported in each column across all columns of a table in BQ:

SELECT col_name, COUNT(1) nulls_count
FROM table t,
UNNEST(REGEXP_EXTRACT_ALL(TO_JSON_STRING(t), r'"(\w+)":null')) col_name
GROUP BY col_name 
;

I need to adjust it so it counts the non-null values. I tried to use negative lookahead but it doesn't seem to work.
My end goal is to indicate wether a certain column reports at least 1 non-null value.

Input example (the table):

BigQuery:使用REGEX计算所有列中的非空值数量。

Output example:

BigQuery:使用REGEX计算所有列中的非空值数量。

column_c is not present since all of its values are nulls.

答案1

得分: 1

以下是你要翻译的代码部分:

select * from (select column, countif(val!= 'null') non_null
from `dataset.table` table1
,unnest(array( 
  select as struct trim(ar[offset(0)], '"') column, trim(ar[offset(1)], '"') val
  from unnest(split(trim(to_json_string(table1), '{}'))) pb,
  unnest([struct(split(pb, ':') as ar)])
)) record
group by column) where non_null!=0
英文:

You can try this, (without REGEX) solution

select * from (select column, countif(val!= 'null') non_null
from `dataset.table` table1
,unnest(array( 
  select as struct trim(ar[offset(0)], '"') column, trim(ar[offset(1)], '"') val
  from unnest(split(trim(to_json_string(table1), '{}'))) pb,
  unnest([struct(split(pb, ':') as ar)])
)) record
group by column) where non_null!=0

output:

BigQuery:使用REGEX计算所有列中的非空值数量。

huangapple
  • 本文由 发表于 2023年2月16日 17:49:21
  • 转载请务必保留本文链接:https://go.coder-hub.com/75470464.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定