SQL: select distinct from two columns and min from the third column
Question
My data set looks like this:
customer | product | purchase-date |
---|---|---|
vik | office | 21-04-2022 |
vik | office2 | 21-02-2021 |
vik | office | 21-01-2021 |
vik | office2 | 21-02-2023 |
abc | office | 1-02-2022 |
abc | office | 1-02-2021 |
I want to run a SQL query that returns this output:
customer | product | purchase-date |
---|---|---|
vik | office | 21-01-2021 |
vik | office2 | 21-02-2021 |
abc | office | 1-02-2021 |
Effectively, I want to use select distinct on customer and product and find the minimum purchase date for each distinct combination. I have tried select statements and partitions, but the output isn't what I expect.
When I use partitions, the minimum date from the entire column gets picked up as opposed to min within each distinct customer & product combination.
Any help would be appreciated.
with data as
(
    select
        *,
        row_number() over (partition by customer order by purchase-date) as rownum
    from
        table
)
select *
from data
where rownum = 1
This is supposed to return a rownum column, and then I run a query with a where clause `rownum = 1`, but when I do that it picks up the lowest value of purchase-date and not the lowest in each customer and product combination.
Answer 1
Score: 1
`select distinct` does not allow calculations such as a minimum, but `GROUP BY` does, and it also produces the distinct combinations:
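-- purchase_date is written with an underscore, as in the tests below;
-- an unquoted purchase-date would be parsed as a subtraction.
SELECT
    customer, product, MIN(purchase_date) AS purchase_date
FROM t
GROUP BY
    customer, product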
Put the columns that you want as the distinct items in the group by clause, and put those columns plus the MIN or MAX (or SUM, COUNT, etc.) in the select clause.
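As a rough illustration of that pattern, the sketch below runs a GROUP BY with several aggregates at once. It assumes an in-memory SQLite database driven from Python's standard `sqlite3` module, dates stored as ISO-8601 text, and the illustrative aliases `first_purchase`, `last_purchase` and `purchases`, none of which come from the original question; other databases may need small syntax changes.

```python
import sqlite3

# Assumption: in-memory SQLite table mirroring the question's data,
# with dates stored as ISO-8601 text so MIN/MAX compare correctly.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (customer TEXT, product TEXT, purchase_date TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("vik", "office", "2022-04-21"),
        ("vik", "office2", "2021-02-21"),
        ("vik", "office", "2021-01-21"),
        ("vik", "office2", "2023-02-21"),
        ("abc", "office", "2022-02-01"),
        ("abc", "office", "2021-02-01"),
    ],
)

# Grouping columns appear in both SELECT and GROUP BY;
# every other selected column is an aggregate over the group.
rows = conn.execute(
    """
    SELECT customer, product,
           MIN(purchase_date) AS first_purchase,
           MAX(purchase_date) AS last_purchase,
           COUNT(*)           AS purchases
    FROM t
    GROUP BY customer, product
    ORDER BY customer, product
    """
).fetchall()

for row in rows:
    print(row)
```

Each (customer, product) pair comes back exactly once, with one value per aggregate.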
To achieve the equivalent with `row_number()`, you need to specify all the columns that you need to be "distinct" in the `partition by` clause:
-- purchase_date is written with an underscore, as in the tests below.
SELECT
    *
FROM (
    SELECT
        customer, product, purchase_date
        , row_number() over(partition by customer, product
                            order by purchase_date ASC) as rn
    FROM t
) d
WHERE rn = 1
Row numbering restarts at 1 for each distinct combination of customer and product, and the row that gets number 1 is the one with the earliest date (if several rows tie for the earliest date, any one of them may get it). Only one row in each partition can get row number 1.
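To see how the partition list changes the result, here is a minimal sketch that runs the same `row_number()` query twice: once partitioned by customer only, as in the question, and once by customer and product. It assumes Python's standard `sqlite3` module linked against SQLite 3.25 or newer (the first release with window functions); the helper name `first_rows` is hypothetical and exists only for this comparison.

```python
import sqlite3

# Assumption: SQLite 3.25+ (window function support), in-memory table
# holding the question's data with ISO-8601 text dates.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (customer TEXT, product TEXT, purchase_date TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("vik", "office", "2022-04-21"),
        ("vik", "office2", "2021-02-21"),
        ("vik", "office", "2021-01-21"),
        ("vik", "office2", "2023-02-21"),
        ("abc", "office", "2022-02-01"),
        ("abc", "office", "2021-02-01"),
    ],
)


def first_rows(partition_cols):
    """Return the rows numbered 1 within each partition (hypothetical helper).

    partition_cols is interpolated into the SQL text, so only pass
    trusted, hard-coded column lists.
    """
    query = f"""
        SELECT customer, product, purchase_date
        FROM (
            SELECT customer, product, purchase_date,
                   ROW_NUMBER() OVER (PARTITION BY {partition_cols}
                                      ORDER BY purchase_date) AS rn
            FROM t
        ) d
        WHERE rn = 1
        ORDER BY customer, product
    """
    return conn.execute(query).fetchall()


per_customer = first_rows("customer")        # one row per customer
per_pair = first_rows("customer, product")   # one row per customer and product

print(per_customer)
print(per_pair)
```

With only customer in the partition, the vik/office2 row is lost because numbering never restarts for a new product; adding product to the partition brings it back.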
Tests:
CREATE TABLE t (
    customer VARCHAR(50),
    product VARCHAR(50),
    purchase_date DATE
);
INSERT INTO t (customer, product, purchase_date)
VALUES
    ('vik', 'office', '2022-04-21'),
    ('vik', 'office2', '2021-02-21'),
    ('vik', 'office', '2021-01-21'),
    ('vik', 'office2', '2023-02-21'),
    ('abc', 'office', '2022-02-01'),
    ('abc', 'office', '2021-02-01');
SELECT
    customer, product, MIN(purchase_date) as purchase_date
FROM t
GROUP BY
    customer, product
customer | product | purchase_date |
---|---|---|
abc | office | 2021-02-01 |
vik | office | 2021-01-21 |
vik | office2 | 2021-02-21 |
SELECT
    *
FROM (
    SELECT
        customer, product, purchase_date
        , row_number() over(partition by customer, product
                            order by purchase_date ASC) as rn
    FROM t
) d
WHERE rn = 1
customer | product | purchase_date | rn |
---|---|---|---|
abc | office | 2021-02-01 | 1 |
vik | office | 2021-01-21 | 1 |
vik | office2 | 2021-02-21 | 1 |