SQL: select distinct from two columns and min from the third column

Question

My data set looks like this

| customer | product | purchase-date |
|----------|---------|---------------|
| vik | office | 21-04-2022 |
| vik | office2 | 21-02-2021 |
| vik | office | 21-01-2021 |
| vik | office2 | 21-02-2023 |
| abc | office | 1-02-2022 |
| abc | office | 1-02-2021 |

I want to run a SQL query that returns this output

| customer | product | purchase-date |
|----------|---------|---------------|
| vik | office | 21-01-2021 |
| vik | office2 | 21-02-2021 |
| abc | office | 1-02-2021 |

I effectively want to use select distinct on customer and product and find the minimum purchase date for each distinct combination. I have tried plain select statements and window partitions, but the output isn't what I expect.

When I use partitions, the minimum date from the entire column gets picked up as opposed to min within each distinct customer & product combination.

Any help would be appreciated.

with data as 
(
    select 
        *, 
        row_number() over (partition by customer order by purchase-date) as rownum
    from 
        table
)
select * 
from data 
where rownum = 1

This is supposed to return a rownum column, and then I filter with a where clause rownum = 1. But when I do that, it picks up the lowest value of purchase-date and not the lowest in each customer and product combination.

Answer 1

Score: 1

SELECT DISTINCT only removes duplicate rows; it cannot compute a minimum per combination. GROUP BY can, and it also produces one row per distinct combination:

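-- purchase_date is written with an underscore, as in the tests below;
-- a hyphenated name such as purchase-date would have to be quoted.
SELECT 
    customer, product, MIN(purchase_date) AS purchase_date
FROM t
GROUP BY
    customer, product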

Put the columns that you want to be distinct in the GROUP BY clause, and put those same columns plus the aggregate (MIN, MAX, SUM, COUNT, etc.) in the SELECT clause.
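
To make the "grouping columns plus aggregates" pattern concrete, here is a minimal sketch that computes several aggregates per (customer, product) group at once. It assumes an in-memory SQLite database driven from Python's built-in sqlite3 module purely as a convenient harness (the question does not name a database engine), and the aliases first_purchase, last_purchase and purchases are made up for illustration.

```python
import sqlite3

# In-memory SQLite database, used only as a harness for the SQL below
# (an assumption; the original question does not name an engine).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (customer TEXT, product TEXT, purchase_date TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("vik", "office", "2022-04-21"),
        ("vik", "office2", "2021-02-21"),
        ("vik", "office", "2021-01-21"),
        ("vik", "office2", "2023-02-21"),
        ("abc", "office", "2022-02-01"),
        ("abc", "office", "2021-02-01"),
    ],
)

# The grouping columns appear in both SELECT and GROUP BY; every other
# selected expression is an aggregate computed per (customer, product) group.
# Dates are stored as ISO 'YYYY-MM-DD' text, which sorts chronologically.
rows = conn.execute(
    """
    SELECT customer, product,
           MIN(purchase_date) AS first_purchase,
           MAX(purchase_date) AS last_purchase,
           COUNT(*)           AS purchases
    FROM t
    GROUP BY customer, product
    ORDER BY customer, product
    """
).fetchall()

for row in rows:
    print(row)
```

The ORDER BY is only there to make the output order predictable; GROUP BY on its own does not guarantee any row order.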

To get the equivalent result with row_number(), you need to list every column that should be "distinct" in the partition by clause:

-- purchase_date with an underscore, as in the tests below
SELECT 
     *
FROM (
    SELECT 
         customer, product, purchase_date
       , row_number() over(partition by customer, product 
                           order by purchase_date ASC) as rn
    FROM t
     ) d
WHERE rn = 1

Row numbering restarts at 1 for each distinct combination of customer and product. Because each partition is ordered by purchase date ascending, the row numbered 1 is the one with the earliest date. If several rows tie for the earliest date, which of them gets number 1 is arbitrary, but only one row per partition can get row number 1.
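
The effect of the partition list can be seen by running the same query twice, once partitioned by customer only (as in the question) and once by customer and product. This is a sketch under assumptions: it uses Python's built-in sqlite3 module as a harness, needs an SQLite build with window functions (3.25 or later), and the helper name earliest is hypothetical.

```python
import sqlite3

# In-memory SQLite database as a test harness (an assumption, not part of
# the original answer). Window functions require SQLite 3.25 or later.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (customer TEXT, product TEXT, purchase_date TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("vik", "office", "2022-04-21"),
        ("vik", "office2", "2021-02-21"),
        ("vik", "office", "2021-01-21"),
        ("vik", "office2", "2023-02-21"),
        ("abc", "office", "2022-02-01"),
        ("abc", "office", "2021-02-01"),
    ],
)


def earliest(partition_cols):
    """Return the rn = 1 rows for the given PARTITION BY column list.

    Hypothetical helper: partition_cols is pasted into the SQL text, which
    is acceptable for this fixed demo but not for untrusted input.
    """
    sql = f"""
        SELECT customer, product, purchase_date
        FROM (
            SELECT customer, product, purchase_date,
                   row_number() OVER (PARTITION BY {partition_cols}
                                      ORDER BY purchase_date ASC) AS rn
            FROM t
        ) d
        WHERE rn = 1
        ORDER BY customer, product
    """
    return conn.execute(sql).fetchall()


per_customer = earliest("customer")        # one row per customer
per_pair = earliest("customer, product")   # one row per (customer, product)

print(per_customer)
print(per_pair)
```

With only customer in the partition list, each customer gets a single row, so the vik / office2 combination disappears from the result. Adding product to the list restores one row per combination.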


Tests:

CREATE TABLE t (
  customer VARCHAR(50),
  product VARCHAR(50),
  purchase_date DATE
);
 
INSERT INTO t (customer, product, purchase_date)
VALUES
  ('vik', 'office', '2022-04-21'),
  ('vik', 'office2', '2021-02-21'),
  ('vik', 'office', '2021-01-21'),
  ('vik', 'office2', '2023-02-21'),
  ('abc', 'office', '2022-02-01'),
  ('abc', 'office', '2021-02-01');

SELECT 
    customer, product, MIN(purchase_date) as purchase_date
FROM t
GROUP BY
    customer, product

| customer | product | purchase_date |
|----------|---------|---------------|
| abc | office | 2021-02-01 |
| vik | office | 2021-01-21 |
| vik | office2 | 2021-02-21 |

SELECT 
     *
FROM (
    SELECT 
         customer, product, purchase_date
       , row_number() over(partition by customer, product 
                           order by purchase_date ASC) as rn
    FROM t
     ) d
WHERE rn = 1

| customer | product | purchase_date | rn |
|----------|---------|---------------|----|
| abc | office | 2021-02-01 | 1 |
| vik | office | 2021-01-21 | 1 |
| vik | office2 | 2021-02-21 | 1 |
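
One more case worth testing is a tie on the earliest date. The sketch below is not part of the original tests: it assumes Python's built-in sqlite3 module as a harness (SQLite 3.25 or later for window functions), uses a hypothetical helper first_rows, and brings in rank() only as a contrast to show why row_number() returns a single row per partition.

```python
import sqlite3

# In-memory SQLite database as a harness (an assumption, not from the answer).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (customer TEXT, product TEXT, purchase_date TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("vik", "office", "2021-01-21"),
        ("vik", "office", "2021-01-21"),  # same earliest date: a tie
        ("vik", "office", "2022-04-21"),
    ],
)


def first_rows(window_fn):
    """Rows numbered 1 by the given window function (hypothetical helper).

    window_fn is pasted into the SQL text; only the fixed names used
    below are intended.
    """
    sql = f"""
        SELECT customer, product, purchase_date
        FROM (
            SELECT customer, product, purchase_date,
                   {window_fn}() OVER (PARTITION BY customer, product
                                       ORDER BY purchase_date ASC) AS rn
            FROM t
        ) d
        WHERE rn = 1
    """
    return conn.execute(sql).fetchall()


with_row_number = first_rows("row_number")  # exactly one of the tied rows
with_rank = first_rows("rank")              # every row tied for first place

print(len(with_row_number), len(with_rank))
```

row_number() hands out 1 to exactly one of the tied rows, so the filter keeps a single row; rank() gives both tied rows the value 1, so both come back. When the rows carry other columns that differ, adding a tie-breaker to the ORDER BY makes the row_number() choice deterministic.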


huangapple
  • Posted on July 18, 2023, 10:58:59
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76709267.html