BigQuery SQL: how to calculate the number of unique values across a row
Question
Let's say I have a table like this:
name | math | english | science |
---|---|---|---|
Amy | 69 | 70 | 70 |
Mike | 65 | 71 | 63 |
Jay | 66 | 66 | 66 |
I want to create a new column that counts the number of unique values in each row across the columns math, english, and science.
This is my expected output:
name | math | english | science | n_unique |
---|---|---|---|---|
Amy | 69 | 70 | 70 | 2 |
Mike | 65 | 71 | 63 | 3 |
Jay | 66 | 66 | 66 | 1 |
For the first row, there are only two distinct scores (69 and 70), so n_unique is 2.
For the second row, there are three distinct scores (65, 71, and 63), so n_unique is 3.
For the third row, there is only one score (66), so n_unique is 1.
How do I write a query to create such a column in BigQuery using SQL?
Answer 1
Score: 1
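Consider the approach below:
select *, (
  -- format('%t', t) renders the whole row as text;
  -- the regex then extracts every run of digits from that text
  select count(distinct val)
  from unnest(regexp_extract_all(format('%t', t), r'\d+')) val
) as n_unique
from your_table t
Applied to the sample data in your question, the output matches the expected table: n_unique is 2 for Amy, 3 for Mike, and 1 for Jay.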
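Note that the regex matches every run of digits in the formatted row, so digits in `name` (or any other column) would be counted too, and a decimal score such as 69.5 would be split into two values. A minimal sketch of a variant that lists the score columns explicitly is shown here; it assumes the three columns share one numeric type, and the inline `your_table` CTE is only sample data copied from the question so the query runs on its own:

```sql
-- Sample data from the question; replace this CTE with your real table.
with your_table as (
  select 'Amy' as name, 69 as math, 70 as english, 70 as science union all
  select 'Mike', 65, 71, 63 union all
  select 'Jay', 66, 66, 66
)
select
  t.*,
  (
    -- Build an array from the score columns of the current row,
    -- unnest it into one value per row, and count the distinct values.
    select count(distinct score)
    from unnest([t.math, t.english, t.science]) as score
  ) as n_unique
from your_table as t
-- n_unique: Amy 2, Mike 3, Jay 1 (row order is not guaranteed)
```

Because the columns are named explicitly, non-score columns never affect the count, and `count(distinct ...)` ignores NULL scores.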
Answer 2
Score: 0
You can "unpivot" your table, count the distinct grades per student, and then join back to your original table:
with mytable as (
  -- sample data from the question
  select 'Amy' as name, 69 as math, 70 as english, 70 as science union all
  select 'Mike', 65, 71, 63 union all
  select 'Jay', 66, 66, 66
),
tmp_unpivot as (
  -- one row per (name, class) pair, with the score in grade
  select * from mytable
  unpivot(grade for class in (math, english, science))
),
agg as (
  -- distinct scores per student
  select name, count(distinct grade) as n_unique
  from tmp_unpivot
  group by 1
)
-- join back on name; this assumes name is unique in mytable
select
  mytable.*,
  agg.n_unique
from mytable
inner join agg on mytable.name = agg.name