PostgreSQL: distinct by one field and order by another

Question

I am following the example in this excellent question and its first answer:

https://stackoverflow.com/questions/586781/postgresql-fetch-the-rows-which-have-the-max-value-for-a-column-in-each-group

But I need to do something slightly different. I want to select distinct rows based on events.uuid, because events.start might not be unique. But I want to order by events.start, because UUIDs are generated in no particular order. The database doesn't like it. What is a good way of doing what I really want to do?

SELECT DISTINCT ON (events.uuid) events.nickname, plays.id
FROM events
JOIN plays ON plays.eventuuid = events.uuid
ORDER BY events.start;

In English I would state it like this: "For each event, give me exactly one play, and order the results by event start time."

Answer 1

Score: 1

If uuid is the primary key of the events table, then you're in luck, because the set returned by:

SELECT DISTINCT ON (events.start, events.uuid) events.nickname, plays.id
FROM events
JOIN plays ON plays.eventuuid = events.uuid
ORDER BY events.start, events.uuid;

is the same as the set returned by:

SELECT DISTINCT ON (events.uuid) events.nickname, plays.id
FROM events
JOIN plays ON plays.eventuuid = events.uuid
ORDER BY events.uuid;

just in a different order.

Having both events.start and events.uuid in the DISTINCT ON clause means the combination of the two is unique. But if uuid is already unique per event row, each uuid maps to exactly one start, so adding other columns from the events table to the DISTINCT ON clause should not change the result set. Note that in both queries, which play is kept for each event is unpredictable unless you add further sort keys to the ORDER BY.
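To try this out, here is a minimal sketch. The table definitions and sample rows are assumptions, since the question does not show its schema; the only thing that matters is that events.uuid is the primary key. The sketch also appends plays.id as a final sort key, which is not part of the queries above, so that the play kept for each event is deterministic (the lowest id).

```sql
-- Hypothetical schema: column types are assumptions, not taken from the question.
CREATE TEMP TABLE events (
    uuid     uuid PRIMARY KEY,
    nickname text NOT NULL,
    start    timestamptz NOT NULL
);

CREATE TEMP TABLE plays (
    id        integer PRIMARY KEY,
    eventuuid uuid NOT NULL REFERENCES events (uuid)
);

-- The event that starts first deliberately has the larger uuid.
INSERT INTO events (uuid, nickname, start) VALUES
    ('00000000-0000-0000-0000-00000000000b', 'early', '2023-01-01 10:00+00'),
    ('00000000-0000-0000-0000-00000000000a', 'late',  '2023-01-02 10:00+00');

INSERT INTO plays (id, eventuuid) VALUES
    (1, '00000000-0000-0000-0000-00000000000a'),
    (2, '00000000-0000-0000-0000-00000000000a'),
    (3, '00000000-0000-0000-0000-00000000000b'),
    (4, '00000000-0000-0000-0000-00000000000b');

-- One row per event, ordered by start time.
-- The DISTINCT ON expressions are the leftmost ORDER BY expressions,
-- and plays.id breaks ties so the lowest play id is kept per event.
SELECT DISTINCT ON (events.start, events.uuid) events.nickname, plays.id
FROM events
JOIN plays ON plays.eventuuid = events.uuid
ORDER BY events.start, events.uuid, plays.id;
-- nickname | id
-- early    |  3
-- late     |  1
```

Ordering by events.uuid alone would list 'late' first here, which is the behavior the question wants to avoid.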

Answer 2

Score: 0

You just need two ORDER BYs, which means you need two levels of SELECT:

SELECT * FROM (
    SELECT DISTINCT ON (events.uuid) events.nickname, plays.id, events.start
    FROM events
    JOIN plays ON plays.eventuuid = events.uuid
    ORDER BY events.uuid
) foo
ORDER BY start;

You can actually omit the inner ORDER BY altogether, as it will be implied. But the implied ORDER BY still needs to be on a different query level than the explicit one.

Starting in v16, the dummy alias ("foo" here) can also be omitted.
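As a rough illustration of the two levels, the sketch below runs without any tables by using inline VALUES lists. The sample data, the text stand-ins for UUIDs, the integer stand-ins for start times, and the alias one_play_per_event are all assumptions made for the example. The extra inner sort key plays.id DESC is also an addition: it shows that the inner ORDER BY decides which play is kept, while the outer ORDER BY only decides the presentation order.

```sql
-- Inline sample data; text stands in for uuid and integers for timestamps.
WITH events (uuid, nickname, start) AS (
    VALUES ('b'::text, 'early', 1),
           ('a',       'late',  2)
), plays (id, eventuuid) AS (
    VALUES (1, 'a'), (2, 'a'), (3, 'b'), (4, 'b')
)
SELECT nickname, id
FROM (
    -- Inner level: one row per event, keeping the highest play id.
    SELECT DISTINCT ON (events.uuid) events.nickname, plays.id, events.start
    FROM events
    JOIN plays ON plays.eventuuid = events.uuid
    ORDER BY events.uuid, plays.id DESC
) AS one_play_per_event
-- Outer level: present the surviving rows by event start time.
ORDER BY start;
-- nickname | id
-- early    |  4
-- late     |  2
```

The alias is kept here so the query also runs on older PostgreSQL versions.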

Posted on 2023-02-23 22:19:17