Using Min and Max to remove duplicate rows, and how to handle null values
Question
I am writing an SQL query in Oracle and trying to remove duplicate rows with the min and max aggregate functions in a case statement. Here is the code in its current state:
select
Student_number,
case
when min(sr.racecd) = max(sr.racecd) then min(sr.racecd) else 'Two or more races'
end as races
This is what the output looks like:
Student Number    Race
4322              Two or more races
4324              White
When I run the code, it combines multiple rows into one and changes the value to 'Two or more races'. The problem is that when it runs into a null value, it changes that to 'Two or more races' too. How can I keep the nulls as is, or change them to 'Unknown'? Also, when I add other columns to the query, the aggregate function does not work the same as when I query only Student_number and racecd. Why is that?
Answer 1
Score: 0
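NULL does not equal anything, not even another NULL, so the equality test fails and the CASE falls through to the ELSE branch. There are a number of solutions. One is to use COUNT instead, which ignores NULLs; a student with only NULL values then reaches the ELSE branch, where MAX returns NULL. Like this: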
select
Student_number,
CASE WHEN COUNT(DISTINCT sr.racecd) > 1 THEN 'Two or more races'
ELSE MAX(sr.racecd)
END as races
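To see the difference between the two CASE expressions, here is a minimal sketch that runs both side by side. It makes several assumptions that are not in the question: the table name `sr`, the sample rows (including student 4330, who has only a NULL race code), and the use of Python's built-in sqlite3 module as a stand-in for Oracle. The COALESCE call is one possible way to turn the remaining NULL into 'Unknown'; it is standard SQL and should also be available in Oracle.

```python
import sqlite3

# In-memory SQLite database standing in for Oracle (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sr (student_number INTEGER, racecd TEXT)")

# Hypothetical sample data; 4330 has only a NULL race code.
conn.executemany(
    "INSERT INTO sr VALUES (?, ?)",
    [
        (4322, "White"),
        (4322, "Asian"),
        (4324, "White"),
        (4324, "White"),
        (4330, None),
    ],
)

query = """
SELECT student_number,
       -- Original approach: NULL = NULL is not true, so ELSE is taken.
       CASE WHEN MIN(racecd) = MAX(racecd) THEN MIN(racecd)
            ELSE 'Two or more races'
       END AS races_min_max,
       -- COUNT ignores NULLs; COALESCE maps an all-NULL group to 'Unknown'.
       CASE WHEN COUNT(DISTINCT racecd) > 1 THEN 'Two or more races'
            ELSE COALESCE(MAX(racecd), 'Unknown')
       END AS races_count
FROM sr
GROUP BY student_number
ORDER BY student_number
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With this sample data, the all-NULL student is labelled 'Two or more races' by the MIN/MAX version and 'Unknown' by the COUNT version. Dropping the COALESCE would leave the NULL as is.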
As for the issue with adding columns: when you add columns that are not aggregated, you are forced to include them in your GROUP BY. That changes the granularity of your query and the rows included in each group, so it affects the result. If you want more information about the student besides student_number, you'll want to aggregate the other columns (e.g. with MAX()).
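The granularity effect can be sketched with a small example. The `school` column, the table name `sr`, and the sample rows are all hypothetical, and SQLite (via Python's sqlite3) again stands in for Oracle. Note that SQLite tolerates a non-aggregated column missing from GROUP BY, whereas Oracle rejects it, so the sketch lists the column in GROUP BY explicitly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sr (student_number INTEGER, racecd TEXT, school TEXT)"
)
# Hypothetical rows: one student, two race codes, two schools.
conn.executemany(
    "INSERT INTO sr VALUES (?, ?, ?)",
    [(4322, "White", "North"), (4322, "Asian", "South")],
)

case_expr = """
CASE WHEN COUNT(DISTINCT racecd) > 1 THEN 'Two or more races'
     ELSE MAX(racecd)
END
"""

# Extra column in GROUP BY: each (student, school) pair is its own group,
# so every group contains a single race code.
by_student_and_school = conn.execute(
    f"SELECT student_number, school, {case_expr} AS races "
    "FROM sr GROUP BY student_number, school ORDER BY school"
).fetchall()

# Extra column aggregated with MAX: still one group per student.
by_student = conn.execute(
    f"SELECT student_number, MAX(school) AS school, {case_expr} AS races "
    "FROM sr GROUP BY student_number"
).fetchall()

print(by_student_and_school)
print(by_student)
```

Grouping by the extra column splits the student into two rows, each with a single race, while aggregating the extra column keeps one row per student and preserves the 'Two or more races' result.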