March 7, 2023 03:53:42

Calculate 7-day sales from each customer's own start date (their first purchase) to find each customer's average units purchased per 7-day period

Question

I have the first table below and want to get the sales for every 7-day period, counted from the day each customer made their first purchase. The second table shows an example.

purchase date	customer id	sales unit
2018-01-01	1	10
2018-01-02	1	5
2018-01-05	2	3
2018-01-15	1	10
2018-01-20	2	4
2018-01-21	2	5

purchase date	customer id	sales unit	every 7 day cumulative sales
2018-01-01	1	10	10
2018-01-02	1	5	15
2018-01-15	1	10	10
2018-01-05	2	3	3
2018-01-20	2	4	9
2018-01-21	2	5	9

The final table should be like this:

purchase week	customer id	7 day sales unit
2018-01-01	1	15
2018-01-05	2	3
2018-01-15	1	10
2018-01-20	2	4

Then I can calculate the average sales per customer:

customer id	average of every 7 day sales unit	Calculation
1	12.5	(15+10) /2
2	3.5	(3+4) /2

The hard parts are:

Every customer has a different first purchase date.
The purchase dates are not consecutive, so I can't use a row-based frame such as unbounded preceding or 6 following rows.
The dataset covers 5 years, so I can't manually subtract 7, 14, etc.
I tried date_trunc('week', date, min(date) over (partition by customerid)).
I also tried a window with rows between 6 preceding and current row, but because the dates are not consecutive it doesn't work.
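
To make the last point concrete, here is a minimal sketch in plain Python rather than SQL. The helper names `last_n_rows_sum` and `last_n_days_sum` are hypothetical and exist only for this illustration. It compares a "last 7 rows" sum with a "last 7 days" sum for customer 1 from the sample data; it shows the distinction and is not meant as a solution.

```python
from datetime import date, timedelta

# Customer 1's purchases from the sample table: (purchase_date, sales_unit).
purchases = [
    (date(2018, 1, 1), 10),
    (date(2018, 1, 2), 5),
    (date(2018, 1, 15), 10),
]

def last_n_rows_sum(rows, index, n=7):
    """Sum the current row and up to n-1 preceding rows (row-based frame)."""
    start = max(0, index - (n - 1))
    return sum(units for _, units in rows[start:index + 1])

def last_n_days_sum(rows, index, n=7):
    """Sum rows whose date falls within the n days ending at the current row."""
    current = rows[index][0]
    earliest = current - timedelta(days=n - 1)
    return sum(units for day, units in rows if earliest <= day <= current)

# For the 2018-01-15 row, the row-based sum reaches back to 2018-01-01,
# while the day-based sum only sees the purchase made on 2018-01-15.
print(last_n_rows_sum(purchases, 2))
print(last_n_days_sum(purchases, 2))
```

A row-based frame counts rows, so gaps between purchase dates silently widen the window.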

Answer 1

Score: 0

You can get the result you want with a SQL window function in two steps:

Step 1. Partition by customer with a window function to get each customer's first purchase date. Then use Presto's date_diff() function to calculate the number of days from the first purchase date to the current purchase date, and divide it by 7 (integer division) to get a week bucket counted from the first purchase.

Step 2. Group by (customer_id, customer_sale_week_bucket) and take sum(sales_unit) and min(purchase_date) within each group.

Here is the query:

with orders_with_customer_week_bucket as
(
  -- Step 1: number of whole 7-day periods since the customer's first purchase
  select
    purchase_date,
    customer_id,
    sales_unit,
    date_diff('day', min(purchase_date) over (partition by customer_id), purchase_date) / 7 as customer_sale_week_bucket
  from
    orders
)
select
  purchase_week,
  customer_id,
  seven_day_sales_unit
from
  (
    -- Step 2: total units and earliest purchase date within each bucket
    select
      customer_id,
      customer_sale_week_bucket,
      min(purchase_date) as purchase_week,
      sum(sales_unit) as seven_day_sales_unit
    from
      orders_with_customer_week_bucket
    group by
      customer_id,
      customer_sale_week_bucket
  ) r

purchase_week	customer_id	seven_day_sales_unit
2018-01-01	1	15
2018-01-05	2	3
2018-01-15	1	10
2018-01-20	2	9
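
The two steps can also be traced outside the database. The following is a minimal Python sketch that mirrors the query on the question's sample rows; the function name `seven_day_sales` is hypothetical, and `//` stands in for the SQL integer division.

```python
from collections import defaultdict
from datetime import date

# Sample rows from the question: (purchase_date, customer_id, sales_unit).
orders = [
    (date(2018, 1, 1), 1, 10),
    (date(2018, 1, 2), 1, 5),
    (date(2018, 1, 5), 2, 3),
    (date(2018, 1, 15), 1, 10),
    (date(2018, 1, 20), 2, 4),
    (date(2018, 1, 21), 2, 5),
]

def seven_day_sales(rows):
    # Step 1: first purchase date per customer (the windowed min in the query).
    first = {}
    for day, cust, _ in rows:
        first[cust] = min(first.get(cust, day), day)

    # Step 2: group by (customer, bucket); keep the earliest date and the sum.
    buckets = defaultdict(lambda: [None, 0])
    for day, cust, units in rows:
        bucket = (day - first[cust]).days // 7  # whole 7-day periods elapsed
        entry = buckets[(cust, bucket)]
        entry[0] = day if entry[0] is None else min(entry[0], day)
        entry[1] += units

    # Return (purchase_week, customer_id, seven_day_sales_unit) sorted by date.
    return sorted((start, cust, total)
                  for (cust, _), (start, total) in buckets.items())

for row in seven_day_sales(orders):
    print(row[0].isoformat(), row[1], row[2])
```

Both of customer 2's purchases on 2018-01-20 and 2018-01-21 fall 15 and 16 days after the first purchase, so they share bucket 2, which is why the last row totals 9.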

Answer 2

Score: 0

You can do it with a CASE expression that looks at the dates. I did this in SQL Server, but I believe the same approach works in Presto, where DATEADD(DAY, -6, d) would instead be written date_add('day', -6, d), with the unit passed as a quoted string.

Also, you mentioned you may need 14 days, so I added a column for that. It's just a matter of changing the number of days in the DATEADD call.

FIDDLE

   SELECT t1.purchaseDate,
          t1.CustomerID,
          t1.SalesUnit,
          SUM(CASE 
                 WHEN t2.purchaseDate BETWEEN DATEADD(DAY, -6, t1.purchaseDate) AND t1.purchaseDate THEN t2.salesUnit 
               END) AS SalesLast7,
          SUM(CASE 
                 WHEN t2.purchaseDate BETWEEN DATEADD(DAY, -13, t1.purchaseDate) AND t1.purchaseDate THEN t2.salesUnit 
               END) AS SalesLast14  
     FROM temp t1
LEFT JOIN temp t2 ON t1.customerID = t2.customerID AND t2.purchaseDate IS NOT NULL
 GROUP BY t1.purchaseDate, t1.customerID, t1.salesUnit

purchaseDate	CustomerID	SalesUnit	SalesLast7	SalesLast14
2018-01-01	1	10	10	10
2018-01-02	1	5	15	15
2018-01-05	2	3	3	3
2018-01-15	1	10	10	15
2018-01-20	2	4	4	4
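
As a rough cross-check of what the self-join computes, here is a small Python sketch on the question's sample rows. The function name `trailing_sales` is hypothetical. Note that these are trailing windows ending at each purchase date, not buckets anchored on the customer's first purchase.

```python
from datetime import date, timedelta

# Sample rows from the question: (purchaseDate, customerID, salesUnit).
temp = [
    (date(2018, 1, 1), 1, 10),
    (date(2018, 1, 2), 1, 5),
    (date(2018, 1, 5), 2, 3),
    (date(2018, 1, 15), 1, 10),
    (date(2018, 1, 20), 2, 4),
    (date(2018, 1, 21), 2, 5),
]

def trailing_sales(rows, days):
    """For each row, sum the same customer's units over the trailing `days` days."""
    out = []
    for day, cust, units in rows:
        earliest = day - timedelta(days=days - 1)  # e.g. 6 days back for a 7-day window
        total = sum(u for d, c, u in rows
                    if c == cust and earliest <= d <= day)
        out.append((day, cust, units, total))
    return out

last7 = trailing_sales(temp, 7)
last14 = trailing_sales(temp, 14)
for (day, cust, units, s7), (_, _, _, s14) in zip(last7, last14):
    print(day.isoformat(), cust, units, s7, s14)
```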

Answer 3

Score: 0

I don't believe your sample data is tallied right. It's not clear to me whether you truly need the cumulative values per week or just a straight total. My guess is that you want one of these two options.

with salesweek as (
    select *,
        -- weeks based on repeating 7-day counter starting at first purchase
        date_diff('day',
            min(purchasedate) over (partition by customerid),
            purchasedate) / 7 + 1 as weeknum -- integer divide
    from T
)
select customerid, sum(salesunit) * 1.0 / count(distinct weeknum) as avg7daysales
from salesweek
group by customerid;
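
A minimal Python sketch of this first option, run on the question's sample rows, may help check the arithmetic. The function name `avg_7day_sales` is hypothetical, and `//` plays the role of the integer divide.

```python
from collections import defaultdict
from datetime import date

# Sample rows from the question: (purchasedate, customerid, salesunit).
T = [
    (date(2018, 1, 1), 1, 10),
    (date(2018, 1, 2), 1, 5),
    (date(2018, 1, 5), 2, 3),
    (date(2018, 1, 15), 1, 10),
    (date(2018, 1, 20), 2, 4),
    (date(2018, 1, 21), 2, 5),
]

def avg_7day_sales(rows):
    # First purchase date per customer.
    first = {}
    for day, cust, _ in rows:
        first[cust] = min(first.get(cust, day), day)

    total = defaultdict(int)   # sum(salesunit) per customer
    weeks = defaultdict(set)   # distinct weeknum values per customer
    for day, cust, units in rows:
        weeknum = (day - first[cust]).days // 7 + 1
        total[cust] += units
        weeks[cust].add(weeknum)

    # Total units divided by the number of 7-day periods that had purchases.
    return {cust: total[cust] / len(weeks[cust]) for cust in total}

print(avg_7day_sales(T))  # {1: 12.5, 2: 6.0}
```

Customer 2 comes out at 6.0 rather than the question's 3.5 because the 5 units bought on 2018-01-21 land in the same 7-day period as the 4 units from 2018-01-20.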

Or possibly:

with firstpurchase as (
    select *,
        min(purchasedate) over (partition by customerid) as firstpurchasedate
    from T
), salesweek as (
    select *,
        -- weeks based on repeating 7-day counter starting at first purchase
        date_diff('day', firstpurchasedate, purchasedate) / 7 + 1 as weeknum
    from firstpurchase
), weighted as (
    select customerid, weeknum, purchasedate,
        sum(salesunit) * count(1) over (
            partition by customerid, weeknum
            order by purchasedate
            rows between current row and unbounded following
        ) as weightedsales
    from salesweek
    group by customerid, weeknum, purchasedate
)
select customerid, sum(weightedsales) * 1.0 / count(distinct weeknum)
from weighted
group by customerid
order by customerid;

https://dbfiddle.uk/sgOOd07H

This is in SQL Server since I don't have access to Presto.
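
The second query is harder to read, so here is a hedged Python sketch of what its weighting appears to do on the question's sample rows. The function name `weighted_avg` is hypothetical. Each day's sales are multiplied by the number of purchase days from that day to the end of its 7-day period, which equals summing the running cumulative totals within the period.

```python
from collections import defaultdict
from datetime import date

# Sample rows from the question: (purchasedate, customerid, salesunit).
T = [
    (date(2018, 1, 1), 1, 10),
    (date(2018, 1, 2), 1, 5),
    (date(2018, 1, 5), 2, 3),
    (date(2018, 1, 15), 1, 10),
    (date(2018, 1, 20), 2, 4),
    (date(2018, 1, 21), 2, 5),
]

def weighted_avg(rows):
    first = {}
    for day, cust, _ in rows:
        first[cust] = min(first.get(cust, day), day)

    # Mirror the GROUP BY: units per (customer, weeknum), keyed by purchase date.
    by_week = defaultdict(lambda: defaultdict(int))
    for day, cust, units in rows:
        weeknum = (day - first[cust]).days // 7 + 1
        by_week[(cust, weeknum)][day] += units

    weighted = defaultdict(int)
    weeks = defaultdict(set)
    for (cust, weeknum), daily in by_week.items():
        days = sorted(daily)
        for pos, day in enumerate(days):
            remaining = len(days) - pos  # current row plus all following rows
            weighted[cust] += daily[day] * remaining
        weeks[cust].add(weeknum)

    return {cust: weighted[cust] / len(weeks[cust]) for cust in weighted}

print(weighted_avg(T))  # {1: 17.5, 2: 8.0}
```

For customer 1, the first period contributes 10 * 2 + 5 * 1 = 25 (the cumulative values 10 and 15 added together) and the later period contributes 10, giving 35 / 2 = 17.5.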
