A simpler way to find the first item from the menu purchased by each customer

Question

These are the tables I have:

CREATE TABLE sales
(
    "customer_id" VARCHAR(1),
    "order_date" DATE,
    "product_id" INTEGER
);

CREATE TABLE menu
(
    "product_id" INTEGER,
    "product_name" VARCHAR(5),
    "price" INTEGER
);

I'm trying to find what was the first item from the menu purchased by each customer. So after a couple of hours I got to this solution:

WITH cte_product AS
(
    SELECT 
        sales.customer_id, menu.product_name, 
        ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY customer_id) row_a
    FROM 
        sales
    INNER JOIN 
        menu ON sales.product_id = menu.product_id
)
SELECT *
FROM cte_product
WHERE row_a = 1

And this in fact gives me back only the first order each customer made by date.

Although I'm happy to have an answer, I'm wondering if there is another way to do this. A simpler way, perhaps?

I previously tried DISTINCT, ORDER BY, WHERE and GROUP BY without success, but I'm new to this, so I'm here to ask you guys whether this is a good enough way or whether there is a better way to do it.

Answer 1

Score: 4

Using window functions for this kind of task is generally the most performant way of doing it. There is one small issue with your window function, though:

  • PARTITION BY customer_id restarts the ranking for each distinct customer.
  • ORDER BY customer_id orders the rows by customer_id within each partition. But inside a partition, customer_id always has the same value, so every row ties and the row that gets number 1 is arbitrary: your output is right only by chance at the moment. This is why you should order by "order_date" instead of "customer_id".

WITH cte_product AS (
    SELECT sales.customer_id, 
           menu.product_name, 
           ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date) row_a
    FROM       sales
    INNER JOIN menu 
            ON sales.product_id = menu.product_id
)
SELECT *
FROM cte_product
WHERE row_a = 1
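
To check the corrected ordering on a small data set, here is a minimal sketch that runs the same CTE against an in-memory SQLite database through Python's sqlite3 module. SQLite (3.25 or later, for window functions), the sample rows and the name FIRST_ITEM_SQL are assumptions made for illustration; none of them come from the question. The rows are inserted out of date order on purpose, so insertion order alone cannot produce the right answer.

```python
import sqlite3

# In-memory database with hypothetical sample data (not from the question).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (customer_id VARCHAR(1), order_date DATE, product_id INTEGER);
CREATE TABLE menu  (product_id INTEGER, product_name VARCHAR(5), price INTEGER);

INSERT INTO menu VALUES (1, 'sushi', 10), (2, 'curry', 15), (3, 'ramen', 12);
INSERT INTO sales VALUES
    ('A', '2021-01-07', 3),
    ('A', '2021-01-01', 1),
    ('B', '2021-01-04', 1),
    ('B', '2021-01-02', 2),
    ('C', '2021-01-05', 3);
""")

# Number each customer's purchases by order_date and keep row 1.
FIRST_ITEM_SQL = """
WITH cte_product AS (
    SELECT sales.customer_id,
           menu.product_name,
           ROW_NUMBER() OVER (PARTITION BY sales.customer_id
                              ORDER BY sales.order_date) AS row_a
    FROM sales
    INNER JOIN menu ON sales.product_id = menu.product_id
)
SELECT customer_id, product_name
FROM cte_product
WHERE row_a = 1
ORDER BY customer_id
"""

first_items = conn.execute(FIRST_ITEM_SQL).fetchall()
print(first_items)  # [('A', 'sushi'), ('B', 'curry'), ('C', 'ramen')]
```

Note that ROW_NUMBER() keeps exactly one row per customer, so if a customer bought several items on their first date, which one is kept is still arbitrary unless you add a tie-breaker to the ORDER BY.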

Another almost equally performant way of doing it, this time without a subquery, is TOP(n), which is SQL Server syntax. TOP(n) returns the first n rows under the given ordering. Since we want every record whose row number is 1, we can use TOP(1) WITH TIES, which returns all rows tied for first place in the ORDER BY, i.e. all rows whose row number is 1.

SELECT TOP(1) WITH TIES
       sales.customer_id, 
       menu.product_name
FROM       sales
INNER JOIN menu 
        ON sales.product_id = menu.product_id
ORDER BY ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date)

Answer 2

Score: 1

Here is another way to get the first order per customer, using GROUP BY and the aggregate function MIN():

First you need to find each customer's first order date:

select customer_id, min(order_date) as min_order_date
from sales s
group by customer_id

Then you need to join the sales table with this dataset to get the product_id:

select s.*
from sales s
inner join (
  select customer_id, min(order_date) as min_order_date
  from sales s
  group by customer_id
) as t on s.customer_id = t.customer_id and s.order_date = t.min_order_date;

Then join the menu table to get the product name:

select s.*, m.product_name
from sales s
inner join (
  select customer_id, min(order_date) as min_order_date
  from sales s
  group by customer_id
) as t on s.customer_id = t.customer_id and s.order_date = t.min_order_date
inner join menu m on m.product_id = s.product_id;
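
The GROUP BY / MIN() approach can be tried out the same way. This is only a sketch under assumptions: SQLite via Python's sqlite3 module, made-up sample rows, and the name MIN_DATE_SQL are all hypothetical. The derived table is joined on t.customer_id, so each sale is compared only with its own customer's first order date. Customer A deliberately has two purchases on the same first date, to show how this approach treats ties.

```python
import sqlite3

# In-memory database with hypothetical sample data (not from the question).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (customer_id VARCHAR(1), order_date DATE, product_id INTEGER);
CREATE TABLE menu  (product_id INTEGER, product_name VARCHAR(5), price INTEGER);

INSERT INTO menu VALUES (1, 'sushi', 10), (2, 'curry', 15), (3, 'ramen', 12);
INSERT INTO sales VALUES
    ('A', '2021-01-01', 1),
    ('A', '2021-01-01', 2),   -- same first date as the row above: a tie
    ('A', '2021-01-07', 3),
    ('B', '2021-01-04', 1),
    ('B', '2021-01-02', 2);
""")

# Find each customer's earliest order_date, then join back to sales and menu.
MIN_DATE_SQL = """
SELECT s.customer_id, s.order_date, m.product_name
FROM sales s
INNER JOIN (
    SELECT customer_id, MIN(order_date) AS min_order_date
    FROM sales
    GROUP BY customer_id
) AS t
    ON s.customer_id = t.customer_id
   AND s.order_date = t.min_order_date
INNER JOIN menu m ON m.product_id = s.product_id
ORDER BY s.customer_id, m.product_name
"""

rows = conn.execute(MIN_DATE_SQL).fetchall()
for row in rows:
    print(row)
```

With this data the query returns two rows for customer A (curry and sushi, both bought on 2021-01-01) and one for customer B. So unlike ROW_NUMBER(), which keeps a single row per customer, the MIN() join returns every item bought on the first date. Which behaviour you want depends on what "first item" should mean when there is a tie.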

huangapple
  • Posted on May 21, 2023 at 00:40:33
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76296287.html