How can I pick one child from each family in SQL?

Question

I use SQLite and have a table that contains a list of names; some of the people are members of the same family.

+----+-----------+------------+-----+
| Id | Last Name | First Name | Age |
+----+-----------+------------+-----+
| 1  | Gordon    | James      |  5  |
| 2  | Gordon    | Mike       | 19  |
| 3  | Gordon    | Sara       |  8  |
| 4  | Gordon    | Cludia     | 25  |
| 5  | Sagget    | Bob        | 22  |
| 6  | Saywer    | Tom        |  9  |
| 7  | Saywer    | Jean       | 20  |
| 8  | Finn      | Hucklberry |  8  |
| 9  | Smith     | John       | 18  |
| 10 | Smith     | Sue        | 39  |
+----+-----------+------------+-----+

I need to select people who are 18 years of age or older, but only one member from each family. The query should pick the youngest person in the family who is at least 18 years old. For the above example, the query should return the following people:

+----+-----------+------------+-----+
| Id | Last Name | First Name | Age |
+----+-----------+------------+-----+
| 2  | Gordon    | Mike       | 19  |
| 5  | Sagget    | Bob        | 22  |
| 7  | Saywer    | Jean       | 20  |
| 9  | Smith     | John       | 18  |
+----+-----------+------------+-----+

Note that Hucklberry Finn was not part of the output since he does not satisfy the condition of 18 years or older, and he has no siblings or relatives :(. Therefore, no 'Finn's are picked.

What is the simplest query to achieve the above result?

Here is what I tried:

SELECT
    id,
    last_name,
    first_name,
    age
FROM
    people p
WHERE
    age >= 18
    and age < (
      select min(age) from people where age > 18 and last_name = p.last_name
)

I feel there must be a more correct and efficient way than my attempt.

Answer 1

Score: 2

Filter the table for age >= 18 and aggregate:

SELECT id,
       last_name,
       first_name,
       MIN(age) AS age
FROM people
WHERE age >= 18
GROUP BY last_name;

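SQLite picks the row with the minimum age from each group: when a query contains exactly one MIN() or MAX() aggregate, the bare columns (here id and first_name) take their values from a row that holds that minimum or maximum.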
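To try the query above locally, here is a minimal sketch using Python's standard sqlite3 module with the sample rows from the question. The table layout (`id`, `last_name`, `first_name`, `age`), the helper name `youngest_adults`, and the added `ORDER BY id` are assumptions made for illustration; they are not part of the answer.

```python
import sqlite3

# Sample rows copied from the question (spellings kept as posted).
PEOPLE = [
    (1, "Gordon", "James", 5),
    (2, "Gordon", "Mike", 19),
    (3, "Gordon", "Sara", 8),
    (4, "Gordon", "Cludia", 25),
    (5, "Sagget", "Bob", 22),
    (6, "Saywer", "Tom", 9),
    (7, "Saywer", "Jean", 20),
    (8, "Finn", "Hucklberry", 8),
    (9, "Smith", "John", 18),
    (10, "Smith", "Sue", 39),
]


def youngest_adults(rows):
    """Load rows into an in-memory table and run the bare-column MIN() query."""
    con = sqlite3.connect(":memory:")
    # Assumed schema: the question only shows the column headings.
    con.execute(
        "CREATE TABLE people (id INTEGER PRIMARY KEY, last_name TEXT,"
        " first_name TEXT, age INTEGER)"
    )
    con.executemany("INSERT INTO people VALUES (?, ?, ?, ?)", rows)
    result = con.execute(
        """
        SELECT id, last_name, first_name, MIN(age) AS age
        FROM people
        WHERE age >= 18
        GROUP BY last_name
        ORDER BY id  -- added only to make the output order stable
        """
    ).fetchall()
    con.close()
    return result


for row in youngest_adults(PEOPLE):
    print(row)
```

With this data there are no ties, so each family has exactly one youngest adult and the query returns ids 2, 5, 7 and 9. The Finn family has no row at all because its only member is filtered out by the WHERE clause before grouping.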

Answer 2

Score: 1

The typical approach in standard SQL would be to either rank the rows:

SELECT id, last_name, first_name, age
FROM
(
  SELECT
    id,
    last_name,
    first_name,
    age,
    ROW_NUMBER() OVER (PARTITION BY last_name ORDER BY age) AS rn
  FROM people p
  WHERE age >= 18
) ranked
WHERE rn = 1;

or use aggregation:

SELECT *
FROM people
WHERE (last_name, age) IN
(
  SELECT
    last_name,
    MIN(age)
  FROM people p
  WHERE age >= 18
  GROUP BY last_name
);

or use WHERE NOT EXISTS <a younger person >= 18 in the same family>:

WITH eighteen_up AS
(
  SELECT *
  FROM people
  WHERE age >= 18
)
SELECT *
FROM eighteen_up p
WHERE NOT EXISTS
(
  SELECT null
  FROM eighteen_up younger
  WHERE younger.last_name = p.last_name
  AND younger.age < p.age
);
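As a quick check, the three standard-SQL approaches can be run side by side on the question's sample data. This is only a sketch: it uses Python's standard sqlite3 module, an assumed table layout, and returns just the ids to keep the comparison short. The aggregate variant uses MIN(age), since the youngest adult per family is wanted. Window functions need SQLite 3.25 or later.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Assumed schema: the question only shows the column headings.
con.execute(
    "CREATE TABLE people (id INTEGER PRIMARY KEY, last_name TEXT,"
    " first_name TEXT, age INTEGER)"
)
# Sample rows copied from the question.
con.executemany(
    "INSERT INTO people VALUES (?, ?, ?, ?)",
    [
        (1, "Gordon", "James", 5), (2, "Gordon", "Mike", 19),
        (3, "Gordon", "Sara", 8), (4, "Gordon", "Cludia", 25),
        (5, "Sagget", "Bob", 22), (6, "Saywer", "Tom", 9),
        (7, "Saywer", "Jean", 20), (8, "Finn", "Hucklberry", 8),
        (9, "Smith", "John", 18), (10, "Smith", "Sue", 39),
    ],
)

QUERIES = {
    # Rank adults within each family by age and keep the first.
    "row_number": """
        SELECT id FROM (
          SELECT id,
                 ROW_NUMBER() OVER (PARTITION BY last_name ORDER BY age) AS rn
          FROM people
          WHERE age >= 18
        ) ranked
        WHERE rn = 1
        ORDER BY id
    """,
    # Match each row against the minimum adult age of its family.
    "aggregate_in": """
        SELECT id FROM people
        WHERE (last_name, age) IN (
          SELECT last_name, MIN(age)
          FROM people
          WHERE age >= 18
          GROUP BY last_name
        )
        ORDER BY id
    """,
    # Keep adults for whom no younger adult exists in the same family.
    "not_exists": """
        WITH eighteen_up AS (SELECT * FROM people WHERE age >= 18)
        SELECT id FROM eighteen_up p
        WHERE NOT EXISTS (
          SELECT NULL FROM eighteen_up younger
          WHERE younger.last_name = p.last_name
            AND younger.age < p.age
        )
        ORDER BY id
    """,
}

results = {
    name: [row[0] for row in con.execute(sql)]
    for name, sql in QUERIES.items()
}
for name, ids in results.items():
    print(name, ids)
```

On this data, which contains no ties, all three queries select the same four people (ids 2, 5, 7 and 9).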

In SQLite, you can usually write this much more simply, as shown in forpas' answer, because SQLite gives bare columns a special treatment that other DBMSs don't offer. That treatment has restrictions when there are ties, though, which I explain below. For this reason it is not a safe option here.

Update

In the comments on the question you say that if there are two (or more) youngest 18+ family members (think of twins), you want to pick only one of them. My first query does this (you'd have to use RANK or DENSE_RANK instead of ROW_NUMBER to get both). My second query does not work as desired in that case, because it returns both persons. My third query also selects both, but can be adjusted to pick only one by replacing AND younger.age < p.age with AND (younger.age < p.age OR (younger.age = p.age AND younger.id < p.id)).

The SQLite-only query shown in forpas' answer is not guaranteed to work when there are ties, because it could pick the ID and the first name from different rows. srinivas st's query works like my second one, i.e. it selects both rows, too.

Conclusion: The only approach you should never use where ties are possible is the bare-columns approach. Depending on how to deal with ties, pick any of the alternatives.
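The tie behaviour described in the update can be checked with a small sketch. It uses Python's standard sqlite3 module and an assumed table layout. The row with id 11 is a hypothetical twin invented for this illustration and does not appear in the question's data. The extra `id` in the window's ORDER BY is also an addition, used only to make the choice between the twins deterministic.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Assumed schema: the question only shows the column headings.
con.execute(
    "CREATE TABLE people (id INTEGER PRIMARY KEY, last_name TEXT,"
    " first_name TEXT, age INTEGER)"
)
con.executemany(
    "INSERT INTO people VALUES (?, ?, ?, ?)",
    [
        (2, "Gordon", "Mike", 19),
        (4, "Gordon", "Cludia", 25),
        (11, "Gordon", "Max", 19),  # hypothetical twin, not in the question
        (9, "Smith", "John", 18),
        (10, "Smith", "Sue", 39),
    ],
)


def ids(sql):
    """Return the ids selected by a query as a plain list."""
    return [row[0] for row in con.execute(sql)]


# ROW_NUMBER keeps exactly one row per family, even with ties.
ranked = ids("""
    SELECT id FROM (
      SELECT id,
             ROW_NUMBER() OVER (PARTITION BY last_name ORDER BY age, id) AS rn
      FROM people
      WHERE age >= 18
    ) ranked
    WHERE rn = 1
    ORDER BY id
""")

# The aggregate lookup returns every row that shares the minimum age.
aggregated = ids("""
    SELECT id FROM people
    WHERE (last_name, age) IN (
      SELECT last_name, MIN(age) FROM people
      WHERE age >= 18
      GROUP BY last_name
    )
    ORDER BY id
""")

# NOT EXISTS with the id tie-breaker keeps the lowest id among the twins.
tie_broken = ids("""
    WITH eighteen_up AS (SELECT * FROM people WHERE age >= 18)
    SELECT id FROM eighteen_up p
    WHERE NOT EXISTS (
      SELECT NULL FROM eighteen_up younger
      WHERE younger.last_name = p.last_name
        AND (younger.age < p.age
             OR (younger.age = p.age AND younger.id < p.id))
    )
    ORDER BY id
""")

print(ranked, aggregated, tie_broken)
```

Here the ranking query and the tie-broken NOT EXISTS query return one Gordon each, while the aggregate lookup returns both 19-year-olds.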

Answer 3

Score: 0

I think this query gets you the desired result (the table is called people here, as in the question):

SELECT t1.Id, t1.Last_Name, t1.First_Name, t1.Age
FROM people t1
JOIN (
    SELECT Last_Name, MIN(Age) AS MinAge
    FROM people
    WHERE Age >= 18
    GROUP BY Last_Name
) t2 ON t1.Last_Name = t2.Last_Name AND t1.Age = t2.MinAge
WHERE t1.Age >= 18;

huangapple
  • Published on June 13, 2023 at 12:51:32
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76461778.html