Bigquery,SQL,如何计算一行中唯一值的数量

huangapple go评论62阅读模式
英文:

Bigquery,SQL, how to calculate number of unique value over a row

问题

让我们假设我有这样一个表格:

姓名 数学 英语 科学
Amy 69 70 70
Mike 65 71 63
Jay 66 66 66

我想创建一个新列,用于计算每行在数学、英语和科学这几列中的唯一值的数量;所以这是我的期望输出:

姓名 数学 英语 科学 n_unique
Amy 69 70 70 2
Mike 65 71 63 3
Jay 66 66 66 1

对于第一行,只有两种分数69、70,所以n_unique是2,
对于第二行,有65、71、63三种分数,所以n_unique是3,
对于第三行,只有一种分数66,所以n_unique是1;
如何编写查询以在Bigquery中使用SQL创建这样的列呢?

英文:

Let' say I have a table like this:

name math english science
Amy 69 70 70
Mike 65 71 63
Jay 66 66 66

I want to create a new column which counts the number of unique value over each row in columns math,english,science;
So this is my expected output:

name math english science n_unique
Amy 69 70 70 2
Mike 65 71 63 3
Jay 66 66 66 1

For the first row, there are only two kind of score 69, 70 so n_unique is 2,
for the second row, there are 65,71,63 so n_unique is 3,
for the third row, only one score 66, so n_unique is 1;
How to write the query to create such column in Bigquery using SQL?

答案1

得分: 1

考虑以下方法:

select *, (
    select count(distinct val)
    from unnest(regexp_extract_all(format('%t', t), r'\d+')) val
  ) as n_unique
from your_table t

如果应用到你问题中的示例数据 - 输出是:

Bigquery,SQL,如何计算一行中唯一值的数量

英文:

Consider below approach

select *, (
    select count(distinct val)
    from unnest(regexp_extract_all(format('%t', t), r'\d+')) val
  ) as n_unique
from your_table t     

if applied to sample data in your question - output is

Bigquery,SQL,如何计算一行中唯一值的数量

答案2

得分: 0

你可以"解除旋转"你的表格,计算每个学生的不同成绩,然后再与你的原始表格连接:

with mytable as (
  select 'Amy' as name, 69 as math, 70 as english, 70 as science union all
  select 'Mike', 65, 71, 63 union all
  select 'Jay', 66, 66, 66
),
tmp_unpivot as (
  select * from mytable
  unpivot(grade for class in(math, english, science))
), 
agg as (
  select name, count(distinct grade) as n_unique
  from tmp_unpivot
  group by 1
)
select
  mytable.*,
  agg.n_unique
from mytable
inner join agg on mytable.name = agg.name
英文:

You can "unpivot" your table, count the distinct grades per student, and then join back to your original table:

with mytable as (
  select 'Amy' as name, 69 as math, 70 as english, 70 as science union all
  select 'Mike', 65, 71, 63 union all
  select 'Jay', 66, 66, 66
),
tmp_unpivot as (
  select * from mytable
  unpivot(grade for class in(math, english, science))
), 
agg as (
  select name, count(distinct grade) as n_unique
  from tmp_unpivot
  group by 1
)
select
  mytable.*,
  agg.n_unique
from mytable
inner join agg on mytable.name = agg.name

huangapple
  • 本文由 发表于 2023年2月14日 02:58:19
  • 转载请务必保留本文链接:https://go.coder-hub.com/75440154.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定