PostgreSQL中与SQL Server中的所有列不同值的等效方法

huangapple go评论69阅读模式
英文:

Postgres SQL equivalent of SQL Server for distinct values with all columns

问题

PostgreSQL SQL:

select distinct on(email_addr) * 
from table1 
order by email_addr, created_date desc;

SQL Server 等效查询:

select distinct(email_addr), id, first_name, last_name, created_date 
from table1 
order by email_addr, created_date desc;

结果 - 70 个不同的值;带有所有行数据。

结果 - 160 个值;带有重复值。

我需要帮助来获取正确的SQL查询。

英文:

Postgres SQL:

select distinct on(email_addr) * 
from table1 
order by email_addr, created_date desc;

Result - 70 distinct values; with all the row data.

SQL Server equivalent:

select distinct(email_addr), id, first_name, last_name, created_date 
from table1 
order by email_addr, created_date desc;

Result - 160 values; with duplicate values.

I needed help in getting the correct SQL query.

答案1

得分: 1

DISTINCT ON 是 PostgreSQL 的一项自定义功能,在其他数据库中找不到。

更标准的方法是使用 ROW_NUMBER() OVER(PARTITION BY Email_Addr ORDER BY Created_Date) as RN,然后仅保留具有 RN=1 的行。您需要使用公共表达式(CTE),因为 OVER 不能在 WHERE 子句中使用。

;WITH nondups AS (
    SELECT *,
        ROW_NUMBER() OVER(PARTITION BY Email_Addr ORDER BY Created_Date DESC) as RN
    From Table1
    WHERE .....
)
SELECT * 
from nondups
where RN=1

这应该在所有支持 CTE 和 ROW_NUMBER() 的数据库上运行。这包括 MySQL 8 及更高版本。

性能可能会受到影响,因为需要在原始结果集的所有行上计算 ROW_NUMBER。这个查询在 PostgreSQL 中也可能很慢,因为需要收集、分区然后排序结果,然后才能选择第一个。

英文:

DISTINCT ON is a custom PostgreSQL feature you won't find in other databases.

A more standard way is to use ROW_NUMBER() OVER(PARTITION BY Email_Addr ORDER BY Created_Date) as RN and keep only the rows that have RN=1. You'll have to use a CTE as OVER can't be used in the WHERE clause.

;WITH nondups AS (
    SELECT *,
        ROW_NUMBER() OVER(PARTITION BY Email_Addr ORDER BY Created_Date DESC) as RN
    From Table1
    WHERE .....
)
SELECT * 
from nondups
where RN=1

This should run on all databases that support CTEs and ROW_NUMBER(). This includes MySQL 8 and later.

Performance may suffer though, as the ROW_NUMBER needs to be calculated on all the rows of the original result set. This query is probably slow in PostgreSQL too, because the results need to be collected, partitioned and then sorted before the first one can be selected.

huangapple
  • 本文由 发表于 2023年5月24日 20:27:47
  • 转载请务必保留本文链接:https://go.coder-hub.com/76323552.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定