A simpler way to get the first item from the menu purchased by each customer
Question
These are the tables I have:
sales
(
"customer_id" VARCHAR(1),
"order_date" DATE,
"product_id" INTEGER
);
menu
(
"product_id" INTEGER,
"product_name" VARCHAR(5),
"price" INTEGER
);
I'm trying to find what was the first item from the menu purchased by each customer. So after a couple of hours I got to this solution:
WITH cte_product AS
(
SELECT
sales.customer_id, menu.product_name,
ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY customer_id) row_a
FROM
sales
INNER JOIN
menu ON sales.product_id = menu.product_id
)
SELECT *
FROM cte_product
WHERE row_a = 1
And this in fact gives me back only the first order each customer made by date.
Although I'm happy to have an answer, I'm wondering if there is another way to do this. A simpler way, perhaps?
I previously tried DISTINCT, ORDER BY, WHERE and GROUP BY without success, but I'm new to this, so I'm here to ask you guys if this is a good enough way or if there is a better way to do it.
Answer 1
Score: 4
Window functions are usually one of the most performant ways to handle this kind of task. There is a small issue with your window function, though:
PARTITION BY customer_id creates a separate ranking for each distinct customer.
ORDER BY customer_id orders the rows by customer_id within each partition. But inside a partition, customer_id always has the same value, so the row order is effectively arbitrary and your output is only right by chance at the moment. This is why you should order by "order_date" instead of "customer_id".
WITH cte_product AS (
SELECT sales.customer_id,
menu.product_name,
ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date) row_a
FROM sales
INNER JOIN menu
ON sales.product_id = menu.product_id
)
SELECT *
FROM cte_product
WHERE row_a = 1
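To see the effect of ordering by order_date, the query can be run against a few rows. The following is only a minimal sketch: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for the real engine (an assumption, and it needs SQLite 3.25 or later for window functions), and the sample rows and menu names are invented for illustration. The sales rows are inserted out of date order on purpose, so the result cannot be explained by insertion order.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the real engine (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (customer_id VARCHAR(1), order_date DATE, product_id INTEGER);
CREATE TABLE menu (product_id INTEGER, product_name VARCHAR(5), price INTEGER);
-- Hypothetical sample data, not taken from the question.
INSERT INTO menu VALUES (1, 'sushi', 10), (2, 'curry', 15), (3, 'ramen', 12);
-- Rows are deliberately inserted out of date order.
INSERT INTO sales VALUES
    ('A', '2021-01-07', 3),
    ('A', '2021-01-01', 1),
    ('B', '2021-01-04', 2),
    ('B', '2021-01-02', 3);
""")

FIRST_ITEM_SQL = """
WITH cte_product AS (
    SELECT sales.customer_id,
           menu.product_name,
           ROW_NUMBER() OVER (PARTITION BY sales.customer_id
                              ORDER BY sales.order_date) AS row_a
    FROM sales
    INNER JOIN menu
            ON sales.product_id = menu.product_id
)
SELECT customer_id, product_name
FROM cte_product
WHERE row_a = 1
ORDER BY customer_id
"""

# One row per customer: the item bought on that customer's earliest date.
first_items = conn.execute(FIRST_ITEM_SQL).fetchall()
print(first_items)
```

With this data, customer A's earliest purchase is the sushi and customer B's is the ramen, regardless of the order in which the rows were stored.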
Another almost equally performant way of doing it, without subqueries this time, is to use TOP(n), which returns the top n rows under the imposed ordering. Since we want all records whose row number is 1, we can use TOP(1) WITH TIES: it returns the first row plus every row that ties with it on the ORDER BY expression, that is, every row whose row number is 1. Note that TOP ... WITH TIES is SQL Server (T-SQL) syntax.
SELECT TOP(1) WITH TIES
sales.customer_id,
menu.product_name
FROM sales
INNER JOIN menu
ON sales.product_id = menu.product_id
ORDER BY ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date)
Answer 2
Score: 1
This is another way to get the first order per customer, using group by and the aggregate function min().
First, find each customer's first order date:
select customer_id, min(order_date) as min_order_date
from sales s
group by customer_id
Then join the sales table with this dataset to get the product_id:
select s.*
from sales s
inner join (
select customer_id, min(order_date) as min_order_date
from sales s
group by customer_id
) as t on s.customer_id = t.customer_id and s.order_date = t.min_order_date;
Then join the menu table to get the product name:
select s.*, m.product_name
from sales s
inner join (
select customer_id, min(order_date) as min_order_date
from sales s
group by customer_id
) as t on s.customer_id = t.customer_id and s.order_date = t.min_order_date
inner join menu m on m.product_id = s.product_id;
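One practical difference between the min() join and ROW_NUMBER() shows up when a customer buys several items on their first date. The sketch below is an illustration under assumptions: it uses Python's built-in sqlite3 module with an in-memory database in place of the real engine (SQLite 3.25 or later for the window function), and the sample rows are invented. The join compares s.customer_id with t.customer_id so that each sale is matched against its own customer's earliest date.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the real engine (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (customer_id VARCHAR(1), order_date DATE, product_id INTEGER);
CREATE TABLE menu (product_id INTEGER, product_name VARCHAR(5), price INTEGER);
-- Hypothetical sample data: customer A buys two items on the first day.
INSERT INTO menu VALUES (1, 'sushi', 10), (2, 'curry', 15), (3, 'ramen', 12);
INSERT INTO sales VALUES
    ('A', '2021-01-01', 1),
    ('A', '2021-01-01', 2),
    ('A', '2021-01-07', 3),
    ('B', '2021-01-02', 2),
    ('B', '2021-01-04', 3);
""")

# min() + join: keeps every sale made on each customer's earliest date.
MIN_DATE_SQL = """
SELECT s.customer_id, m.product_name
FROM sales s
INNER JOIN (
    SELECT customer_id, MIN(order_date) AS min_order_date
    FROM sales
    GROUP BY customer_id
) AS t ON s.customer_id = t.customer_id
      AND s.order_date = t.min_order_date
INNER JOIN menu m ON m.product_id = s.product_id
ORDER BY s.customer_id, m.product_name
"""

# ROW_NUMBER(): keeps exactly one row per customer, even when dates tie.
ROW_NUMBER_SQL = """
WITH ranked AS (
    SELECT s.customer_id,
           m.product_name,
           ROW_NUMBER() OVER (PARTITION BY s.customer_id
                              ORDER BY s.order_date) AS rn
    FROM sales s
    INNER JOIN menu m ON m.product_id = s.product_id
)
SELECT customer_id, product_name
FROM ranked
WHERE rn = 1
ORDER BY customer_id
"""

min_date_rows = conn.execute(MIN_DATE_SQL).fetchall()
row_number_rows = conn.execute(ROW_NUMBER_SQL).fetchall()
print(min_date_rows)         # both of A's first-day items, plus B's first item
print(len(row_number_rows))  # one row per customer
```

The min() join returns both of customer A's first-day items, while ROW_NUMBER() returns a single row for A, and which of the two tied items it picks is not defined by the query. Which behaviour is preferable depends on whether "first item" should include every item from the first order date.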