How to keep only the rows that have the minimum/earliest date per group in SQL
Question
I am trying to filter a table in SQL so that only the rows that have the earliest date are kept per ID.
If I had the following table:
    ID  Gender  Event_Date
    1   M       2015-03-01
    1   M       2012-04-15
    2   F       2005-11-14
    2   F       2005-11-14
    2   F       2005-11-14
    2   F       2013-05-19
    2   F       2013-05-19
    3   F       2010-07-02
    3   F       2004-09-09
    3   F       2004-02-17
    3   F       2004-02-17
    4   M       2019-01-29
    5   M       2006-04-04
    5   M       2006-08-07
And I would like to keep all of the rows that have the earliest date, when grouped by ID:
    ID  Gender  Event_Date
    1   M       2012-04-15
    2   F       2005-11-14
    2   F       2005-11-14
    2   F       2005-11-14
    3   F       2004-02-17
    3   F       2004-02-17
    4   M       2019-01-29
    5   M       2006-04-04
If I did the following:
    SELECT * FROM table
    GROUP BY ID
    ORDER BY Event_Date ASC
    LIMIT 1;
How does LIMIT handle ties? Would it keep all of the rows with the earliest date or just pick one at random? (I want to keep all of them.) I also tried versions of this:
    SELECT * FROM table
    WHERE Event_Date IS Min(Event_Date)
Answer 1
Score: 1
One canonical approach here would be to use the RANK() window function:
    WITH cte AS (
        SELECT t.*, RANK() OVER (PARTITION BY ID ORDER BY Event_Date) rnk
        FROM yourTable t
    )
    SELECT ID, Gender, Event_Date
    FROM cte
    WHERE rnk = 1
    ORDER BY ID;
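As a quick check, the RANK() query can be run against the sample rows from the question. The sketch below uses Python's built-in sqlite3 module with an in-memory database. The table name yourTable follows the answer, the function name earliest_rows_rank is made up for illustration, and the code assumes the bundled SQLite is version 3.25 or newer, since that is when window functions were added.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    (1, "M", "2015-03-01"), (1, "M", "2012-04-15"),
    (2, "F", "2005-11-14"), (2, "F", "2005-11-14"), (2, "F", "2005-11-14"),
    (2, "F", "2013-05-19"), (2, "F", "2013-05-19"),
    (3, "F", "2010-07-02"), (3, "F", "2004-09-09"),
    (3, "F", "2004-02-17"), (3, "F", "2004-02-17"),
    (4, "M", "2019-01-29"),
    (5, "M", "2006-04-04"), (5, "M", "2006-08-07"),
]

# Same query as in the answer; RANK() gives every tied row the same rank,
# so all rows sharing the earliest date in a group get rnk = 1.
RANK_QUERY = """
WITH cte AS (
    SELECT t.*, RANK() OVER (PARTITION BY ID ORDER BY Event_Date) AS rnk
    FROM yourTable t
)
SELECT ID, Gender, Event_Date
FROM cte
WHERE rnk = 1
ORDER BY ID;
"""


def earliest_rows_rank():
    """Load the sample data into an in-memory table and run the RANK() query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE yourTable (ID INTEGER, Gender TEXT, Event_Date TEXT)"
    )
    conn.executemany("INSERT INTO yourTable VALUES (?, ?, ?)", ROWS)
    result = conn.execute(RANK_QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in earliest_rows_rank():
        print(row)
```

The dates are stored as ISO 8601 text here, which sorts in chronological order. LIMIT 1, by contrast, caps the whole result set at a single row, so it cannot keep tied rows on its own.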
If your database does not have RANK(), then we can use exists logic:
    SELECT ID, Gender, Event_Date
    FROM yourTable t1
    WHERE NOT EXISTS (
        SELECT 1
        FROM yourTable t2
        WHERE t2.ID = t1.ID AND
              t2.Event_Date < t1.Event_Date
    );
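The NOT EXISTS version can be checked the same way. This is a minimal sketch using Python's sqlite3 module and the question's sample rows; the function name earliest_rows_not_exists is hypothetical, and the rows are sorted in Python because the query itself has no ORDER BY.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    (1, "M", "2015-03-01"), (1, "M", "2012-04-15"),
    (2, "F", "2005-11-14"), (2, "F", "2005-11-14"), (2, "F", "2005-11-14"),
    (2, "F", "2013-05-19"), (2, "F", "2013-05-19"),
    (3, "F", "2010-07-02"), (3, "F", "2004-09-09"),
    (3, "F", "2004-02-17"), (3, "F", "2004-02-17"),
    (4, "M", "2019-01-29"),
    (5, "M", "2006-04-04"), (5, "M", "2006-08-07"),
]

# Same query as in the answer: keep a row only if no row with the same ID
# has a strictly earlier Event_Date.
NOT_EXISTS_QUERY = """
SELECT ID, Gender, Event_Date
FROM yourTable t1
WHERE NOT EXISTS (
    SELECT 1
    FROM yourTable t2
    WHERE t2.ID = t1.ID AND
          t2.Event_Date < t1.Event_Date
);
"""


def earliest_rows_not_exists():
    """Load the sample data into an in-memory table and run the NOT EXISTS query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE yourTable (ID INTEGER, Gender TEXT, Event_Date TEXT)"
    )
    conn.executemany("INSERT INTO yourTable VALUES (?, ?, ?)", ROWS)
    result = conn.execute(NOT_EXISTS_QUERY).fetchall()
    conn.close()
    # The query has no ORDER BY, so sort here for a stable display order.
    return sorted(result)


if __name__ == "__main__":
    for row in earliest_rows_not_exists():
        print(row)
```

Tied rows survive because the subquery uses a strict `<` comparison: a row is dropped only when some other row in the same ID group has an earlier date, never when the dates are equal.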