Getting only the highest counts per type in Postgres

Question

Say I have a table with kids and their toys.

CREATE TABLE kids_toys (
  kid_name character varying,
  toy_type character varying,
  toy_name character varying
);
kid_name  toy_type  toy_name
Edward    bear      Pooh
Edward    bear      Pooh2
Edward    bear      Simba
Edward    car       Vroom
Lydia     doll      Sally
Lydia     car       Beeps
Lydia     car       Speedy
Edward    car       Red
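
To try the queries on this page, the sample rows need to be loaded first. A minimal sketch, assuming the kids_toys table defined above already exists and is empty:

```sql
-- Load the eight sample rows shown in the table above.
insert into kids_toys (kid_name, toy_type, toy_name) values
  ('Edward', 'bear', 'Pooh'),
  ('Edward', 'bear', 'Pooh2'),
  ('Edward', 'bear', 'Simba'),
  ('Edward', 'car',  'Vroom'),
  ('Lydia',  'doll', 'Sally'),
  ('Lydia',  'car',  'Beeps'),
  ('Lydia',  'car',  'Speedy'),
  ('Edward', 'car',  'Red');
```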

I want to get a list of the most popular toy type for each kid, grouped by kid. So the result would be:

kid_name  toy_type  count
Edward    bear      3
Lydia     car       2

Assuming Postgres 15 as the engine, how would I query to do this? I keep getting stuck on how to generate the count but then only take the max result from each per-kid count.

Answer 1

Score: 3

In Postgres, I would recommend distinct on, which can get the job done in a single query, with no subquery:

select distinct on (kid_name) kid_name, toy_type, count(*) as cnt
from kids_toys
group by kid_name, toy_type
-- per kid: highest count first; toy_type breaks ties alphabetically
order by kid_name, count(*) desc, toy_type;

The query groups the dataset by kid and toy type. Then distinct on ensures that only one record is returned for each kid; the order by clause puts each kid's most popular toy type first. If there are ties, the alphabetically first toy type is picked.
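
It may help to look at the grouping step on its own, before distinct on discards anything. The sketch below is just the query above with distinct on removed; the result in the comment assumes the table holds exactly the eight sample rows from the question:

```sql
-- Per-kid, per-type counts, sorted the same way distinct on will see them.
select kid_name, toy_type, count(*) as cnt
from kids_toys
group by kid_name, toy_type
order by kid_name, count(*) desc, toy_type;

-- With the sample data:
--  kid_name | toy_type | cnt
--  Edward   | bear     |   3
--  Edward   | car      |   2
--  Lydia    | car      |   2
--  Lydia    | doll     |   1
```

distinct on (kid_name) then keeps the first row of each kid_name group in that order, which is why the order by clause has to start with kid_name.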


If you wanted to retain ties (which Postgres' distinct on cannot do), we could use rank() and fetch with ties instead:

select kid_name, toy_type, count(*) cnt
from kids_toys
group by kid_name, toy_type
order by rank() over(partition by kid_name order by count(*) desc)
fetch first row with ties
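
If the ordering trick above feels too implicit, the same tie-preserving result can probably be written by filtering on the rank in a subquery. This is a sketch of an alternative formulation, not part of the original answer; the aliases ranked and rnk are made up for illustration:

```sql
-- Keep every toy type that shares the top count for its kid.
select kid_name, toy_type, cnt
from (
  select kid_name, toy_type, count(*) as cnt,
         rank() over (partition by kid_name order by count(*) desc) as rnk
  from kids_toys
  group by kid_name, toy_type
) as ranked
where rnk = 1
order by kid_name, toy_type;
```

With the sample data there are no ties, so this returns the same two rows as the distinct on query. If Lydia owned a second doll, both her car and doll rows would come back, each with a count of 2.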

Answer 2

Score: 2

First, group by kid_name and toy_type to find how many toys each kid has of each type.

Then, add a row_number window function, partitioned only by kid_name and ordered by the count descending, to find the position of each toy_type from highest count to lowest for each individual kid.

Lastly, keep only the records with row_num = 1.

If you would like the top 3 toy types per kid instead, for example, you can use row_num <= 3.

select kid_name, toy_type, cnt
from (
  select kid_name, toy_type, cnt,
         row_number() over (partition by kid_name order by cnt desc) as row_num
  from (
    select kid_name, toy_type, count(*) as cnt
    from kids_toys
    group by kid_name, toy_type
  ) as grouped
) as with_row_num
where row_num = 1;
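
The top-3 variant mentioned above could look roughly like the following. It is a sketch that moves the grouping into a CTE for readability. The extra toy_type sort key is an assumption added here, not part of the original answer, so that equal counts are numbered in a predictable order:

```sql
-- Top 3 toy types per kid, highest count first.
with grouped as (
  select kid_name, toy_type, count(*) as cnt
  from kids_toys
  group by kid_name, toy_type
)
select kid_name, toy_type, cnt
from (
  select kid_name, toy_type, cnt,
         -- toy_type is an added tie-breaker (assumption) for deterministic numbering
         row_number() over (partition by kid_name order by cnt desc, toy_type) as row_num
  from grouped
) as with_row_num
where row_num <= 3
order by kid_name, row_num;
```

Because row_number never assigns the same number twice within a partition, this returns at most three rows per kid even when counts are tied; use rank() instead if tied types should all be kept.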

Posted by huangapple on June 27, 2023, 21:59:34
Original link: https://go.coder-hub.com/76565641.html