May 24, 2023 20:27:47

Postgres SQL equivalent of SQL Server for distinct values with all columns

Question

Postgres SQL:

select distinct on(email_addr) * 
from table1 
order by email_addr, created_date desc;

Result - 70 distinct values; with all the row data.

SQL Server equivalent:

select distinct(email_addr), id, first_name, last_name, created_date 
from table1 
order by email_addr, created_date desc;

Result - 160 values; with duplicate values.

I need help getting the correct SQL Server query.

Answer 1

Score: 1

DISTINCT ON is a custom PostgreSQL feature you won't find in other databases.

A more standard way is to use ROW_NUMBER() OVER(PARTITION BY Email_Addr ORDER BY Created_Date DESC) as RN and keep only the rows that have RN=1. Window functions can't be used in the WHERE clause, so the row number has to be computed in a CTE (or a derived table) and filtered in the outer query.

;WITH nondups AS (
    SELECT *,
        ROW_NUMBER() OVER(PARTITION BY Email_Addr ORDER BY Created_Date DESC) as RN
    From Table1
    WHERE .....
)
SELECT * 
from nondups
where RN=1

This should run on all databases that support CTEs and ROW_NUMBER(). This includes MySQL 8 and later.
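As a quick way to try the pattern without a SQL Server or PostgreSQL instance, here is a minimal sketch that runs the same CTE against an in-memory SQLite database from Python. It assumes the bundled SQLite is 3.25 or newer (the first release with window functions). The sample rows and the trimmed-down column types are made up for illustration; only the table and column names come from the question.

```python
import sqlite3

# In-memory database; assumes SQLite >= 3.25 for window function support.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE table1 (
        id INTEGER PRIMARY KEY,
        email_addr TEXT,
        first_name TEXT,
        last_name TEXT,
        created_date TEXT
    )
""")

# Hypothetical sample data: two addresses appear twice, one appears once.
rows = [
    (1, "a@example.com", "Ann", "Lee", "2023-01-01"),
    (2, "a@example.com", "Ann", "Lee", "2023-03-01"),
    (3, "b@example.com", "Bob", "Ray", "2023-02-01"),
    (4, "b@example.com", "Bob", "Ray", "2023-01-15"),
    (5, "c@example.com", "Cy", "Fox", "2023-04-01"),
]
conn.executemany("INSERT INTO table1 VALUES (?, ?, ?, ?, ?)", rows)

# Number the rows within each email_addr, newest first, and keep row 1.
query = """
    WITH nondups AS (
        SELECT *,
               ROW_NUMBER() OVER (
                   PARTITION BY email_addr
                   ORDER BY created_date DESC
               ) AS rn
        FROM table1
    )
    SELECT id, email_addr, first_name, last_name, created_date
    FROM nondups
    WHERE rn = 1
    ORDER BY email_addr
"""
latest = conn.execute(query).fetchall()
for row in latest:
    print(row)
```

Each address comes back exactly once, with the columns of its most recent row, which is what DISTINCT ON (email_addr) ... ORDER BY email_addr, created_date DESC returns in PostgreSQL. If two rows for the same address share the same created_date, ROW_NUMBER() picks one of them arbitrarily unless a tie-breaker such as id is added to the ORDER BY.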

Performance may suffer though, as the ROW_NUMBER needs to be calculated on all the rows of the original result set. This query is probably slow in PostgreSQL too, because the results need to be collected, partitioned and then sorted before the first one can be selected.
