2023年8月4日 08:26:00go评论81阅读模式

英文:

SQL: Counting the Number of Distinct Schools a Student Attended

问题

以下是翻译好的部分:

我的问题: 我想要找出：

对于任何在2010年上了“学校A”并且GPA大于5的学生
在2010年至2015年之间，这些学生上了多少所不同的学校？

这是我尝试为这个问题编写的查询 - 我首先确定了在2010年至2015年之间的所有行的子集，然后我“标记”了在2010年上了“学校A”并且GPA大于5的学生。最后，我使用了一个连接将所有这些内容合并在一起 - 并且使用了两层计数聚合来获得最终答案：

with cte_a as
(select * from my_table where school_year >= 2010 and school_year <= 2015),

cte_b as
(select distinct student_id from cte_a
where school = 'School A' and gpa > 5 and year = 2010)

select count_distinct_schools, count(student_id) from(
select t.student_id, count(distinct school) as count_distinct_schools
from my_table t
join cte_b
on t.student_id = cte_b.student_id
group by t.student_id)a
group by count_distinct_schools;

在最后一部分代码中，您应该使用cte_b，因为您已经通过cte_b子查询标记了符合条件的学生。这将确保您获得符合您问题要求的正确答案。

希望这可以帮助您正确执行查询，并获得您期望的结果格式。

英文:

I am working with Netezza SQL.

I have the following table (my_table) of student GPA over different years and what school they attended:

    student_id year gpa school_year
1           1 2010   6    School A
2           1 2011   4    School B
3           1 2012   4    School A
4           2 2010   5    School A
5           2 2011   1    School B
6           2 2012   2    School B
7           3 2009   5    School A
8           3 2010   6    School A
9           3 2011   5    School A
10          4 2010   5    School A
11          4 2011   7    School B
12          4 2014   6    School C
13          5 2010   4    School C
14          5 2011   2    School A
15          5 2012   2    School A

My question: I want to find out:

For any student that attended "School A" in 2010 and had a GPA > 5
Between the years 2010-2015, how many distinct schools did these students attend?

Here is my attempt to write a query for this problem - I first identified a subset of all rows between 2010 and 2015, then I "tagged" students who attended School A in 2010 and had a GPA > 5. Finally, I used a join to bring all this together - and a two layered count aggregation to get the final answer:

with cte_a as
(select * from my_table where school_year &gt;= 2010 and school_year&lt;=2015),

cte_b as
(select distinct student_id from cte_a
where school = &#39;School A&#39; and gpa&gt;5 and school_year = 2010)

select count_1, count(student_id) from(
select t.student_id, count(distinct school) as count_1
from my_table t
join cte_b
on t.student_id = cte_b.student_id
group by t.student_id)a
group by count_1;

I am confused if in the last chunk of code, do I need to use cte_b or my_table.

Can someone please show me how to do this correctly? In the end, I am expecting the final answer in this format:

  count_distinct_schools      count(student_id)
       2                           1
       1                           1

Thanks!

答案1

得分: 2

不确定为什么需要双重汇总，但这样应该可以工作：

SELECT t.student_id, COUNT(DISTINCT school)
FROM my_table t
JOIN (
    SELECT DISTINCT student_id
    FROM my_table
    WHERE school = '学校A' AND gpa > 5 AND year = 2010
) s ON s.student_id = t.student_id
WHERE t.year BETWEEN 2010 AND 2015
GROUP BY t.student_id

英文:

I am not sure why you need double aggregation, but shouldn't this work?

SELECT t.student_id, COUNT(DISTINCT school)
FROM my_table t
JOIN (
	SELECT DISTINCT student_id
	FROM my_table
	WHERE school = &#39;School A&#39; AND gpa &gt; 5 AND year = 2010
) s ON s.student_id = t.student_id
WHERE t.year BETWEEN 2010 AND 2015
GROUP BY t.student_id

答案2

得分: 2

起初，我们找到了在2010年就读于“学校A”且GPA大于5的学生，使用CTE_Students。

然后在主表中找到所有具有找到的学生ID的行，并对它们进行分组并计算学校数量。

WITH
CTE_Students
AS
(
	SELECT student_id
	FROM my_table
	WHERE
		school = 'School A' 
		AND gpa > 5 
		AND school_year = 2010
)
SELECT
	my_table.student_id
	,COUNT(DISTINCT school) AS SchoolCount
FROM
	my_table
WHERE
	my_table.student_id IN (SELECT CTE_Students.student_id FROM CTE_Students)
    AND my_table.year >= 2010
    AND my_table.year <= 2015
GROUP BY
	my_table.student_id
;

如果您使用IN而不是JOIN，那么在第一个CTE中不需要获取学生ID的DISTINCT列表。

实际上，这并不重要。即使您将my_table与CTE_Students连接，并且CTE_Students具有相同学生ID的多个行，最终学校的计数是DISTINCT的，因此所有可能的重复项都将被消除。

因此，在查找在2010年就读于“学校A”且GPA大于5的学生时，无需使用DISTINCT，在COUNT()中的DISTINCT会处理它。

英文:

At first we find students that attended "School A" in 2010 and had a GPA > 5 using CTE_Students.

Then in the main table find all rows with the found student IDs and group them and count the schools.

WITH
CTE_Students
AS
(
	SELECT student_id
	FROM my_table
	WHERE
		school = &#39;School A&#39; 
		and gpa &gt; 5 
		and school_year = 2010
)
SELECT
	my_table.student_id
	,COUNT(DISTINCT school) AS SchoolCount
FROM
	my_table
WHERE
	my_table.student_id IN (SELECT CTE_Students.student_id FROM CTE_Students)
    AND my_table.year &gt;= 2010
    AND my_table.year &lt;= 2015
GROUP BY
	my_table.student_id
;

If you use IN instead of JOIN, then it is not necessary to get DISTINCT list of student IDs in the first CTE.

Actually, it doesn't matter. Even if you join my_table with the CTE_Students and CTE_Students has several rows with the same student ID, in the end the count of schools is DISTINCT, so all possible duplicates will be eliminated.

So, there is no need to have DISTINCT when finding students that attended "School A" in 2010 and had a GPA > 5, the DISTINCT in the COUNT() will take care of it.

答案3

得分: 1

我认为这应该可以工作。

;WITH cte
AS (
	SELECT *
	FROM my_table s
	WHERE s.school = 'school_a'
		AND s.school_year = 2010
		AND s.gpa > 5.0
	)
SELECT s.student_id
	,count(DISTINCT s.school) AS NumOfDistinctSchools
FROM my_table s
WHERE s.student_id IN (
		SELECT t.student_id
		FROM cte t
		)
	AND s.school_year <= 2015
	AND s.school_year >= 2010
GROUP BY s.student_id

英文:

I think this should work.

;WITH cte
AS (
	SELECT *
	FROM my_table s
	WHERE s.school = &#39;school_a&#39;
		AND s.school_year = 2010
		AND s.gpa &gt; 5.0
	)
SELECT s.student_id
	,count(DISTINCT s.school) AS NumOfDistinctSchools
FROM my_table s
WHERE s.student_id IN (
		SELECT t.student_id
		FROM cte t
		)
	AND s.school_year &lt;= 2015
	AND s.school_year &gt;= 2010
GROUP BY s.student_id

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

SQL：计算学生所参加的不同学校的数量

问题

答案1

答案2

答案3

如何在Java中使用正则表达式替换SQL查询中的表名？

如何从表a中获取插入的ID，并将该ID用于插入表b。

SQL 选择具有最高版本的行

在PostgreSQL中查询日期范围，不包括周末。

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

发表评论