2023年5月22日 23:04:41go评论85阅读模式

英文:

How to do partition by using date intervals in SQL

问题

以下是翻译好的代码部分：

case when
interval_start - lag(interval_end) over (partition by account_id order by interval_start) as prev_interval_end > 30 then 1 else 0 end

请注意，这只是原始代码的一部分，您需要将其与其他查询一起使用，以便正确计算“reactivated”列中的值。如果您需要更多帮助，请提供完整的查询或具体的问题描述。

英文:

I have the following sample data with 3 columns, i.e., account_id, interval_start, interval_end. And I want to create a reactivated column with the given output in the following table.

For each account_id, reactivated column should start with a 0 value and then I want to check if the previous interval_end and current interval_start value is more than a month then I want to increment that value as in to 1 and then next time when previous interval_end and current interval_start is more than a month, then it should be 2 and so on.

account_id	interval_start	interval_end	reactivated
ABC	2015-01-01 00:00:00.000	2015-02-01 00:00:00.000	0
ABC	2015-02-01 00:00:00.000	2015-03-01 00:00:00.000	0
ABC	2015-03-01 00:00:00.000	2015-04-01 00:00:00.000	0
ABC	2015-09-01 00:00:00.000	2015-10-01 00:00:00.000	1
ABC	2015-10-01 00:00:00.000	2015-11-01 00:00:00.000	1
ABC	2017-04-01 00:00:00.000	2017-05-01 00:00:00.000	2
ABC	2017-05-01 00:00:00.000	2017-06-01 00:00:00.000	2
ABC	2017-06-01 00:00:00.000	2017-07-01 00:00:00.000	2
ABC	2017-07-01 00:00:00.000	2017-08-01 00:00:00.000	2
ABC	2019-03-01 00:00:00.000	2019-04-01 00:00:00.000	3
ABC	2019-04-01 00:00:00.000	2019-05-01 00:00:00.000	3
ABC	2019-05-01 00:00:00.000	2019-06-01 00:00:00.000	3
ABC	2021-07-01 00:00:00.000	2021-08-01 00:00:00.000	4
ABC	2021-08-01 00:00:00.000	2021-09-01 00:00:00.000	4
DEF	2015-03-01 00:00:00.000	2015-04-01 00:00:00.000	0
DEF	2015-04-01 00:00:00.000	2015-05-01 00:00:00.000	0
DEF	2017-06-01 00:00:00.000	2017-07-01 00:00:00.000	1
DEF	2017-07-01 00:00:00.000	2017-08-01 00:00:00.000	1
DEF	2019-05-01 00:00:00.000	2019-06-01 00:00:00.000	2

So far I'm only able to get 0s and 1s in the reactivated column by doing the following but its not what I want:

case when
interval_start - lag(interval_end) over (partition by account_id order by interval_start) as prev_interval_end &gt; 30 then 1 else 0 end

How do I get the value to increment as well for each account_id please.

答案1

得分: 1

Here's the translated code portion:

你想要一个连续重叠行的运行计数。
使用`lag()`你已经走在正确的轨道上；你只需要再加一层窗口函数来对这些标志进行`sum`操作：
select t.*,
    sum(case when interval_start > lag_interval_end then 1 else 0 end) grp
from (
    select t.*,
        lag(interval_end) over (partition by account_id order by interval_start) as lag_interval_end
    from mytable t
) t
这是标准的SQL，在大多数支持窗口函数的数据库上都可以工作。根据你的实际数据库，可能有更简洁的方式来表达窗口的求和，例如在Postgres中：
count(*) filter(where interval_start > lag_interval_end) grp
或者在MySQL中：
sum(interval_start > lag_interval_end) grp
在Redshift中，我们需要一个`rows`子句：
sum(case when interval_start > lag_interval_end then 1 else 0 end) over (
    partition by account_id 
    order by interval_start 
    rows unbounded preceding
) as  grp

Please note that the code is now translated into Chinese, as per your request.

英文:

You want a running count of consecutive rows that overlap.

You are on the right track with lag(); you just need another level of windowing to sum those flags:

select t.*,
    sum(case when interval_start &gt; lag_interval_end then 1 else 0 end) grp
from (
    select t.*,
        lag(interval_end) over (partition by account_id order by interval_start) as lag_interval_end
    from mytable t
) t

This is standard SQL, that should work on most databases that support window functions. Depending on your actual database, there might be neater ways to express the window sum, eg in Postgres:

count(*) filter(where interval_start &gt; lag_interval_end) grp

Or in MySQL:

sum(interval_start &gt; lag_interval_end) grp

In Redshift, we need a rows clause:

sum(case when interval_start &gt; lag_interval_end then 1 else 0 end) over (
    partition by account_id 
    order by interval_start 
    rows unbounded preceding
) as  grp

答案2

得分: 0

在上面的答案中添加。由于我使用Redshift，对我有效的是：

sum(case when interval_start > lag_interval_end then 1 else 0 end) over (partition by account_id order by interval_start rows unbounded preceding) as grp

英文:

Adding to above answer. Since I'm using Redshift what worked for me is:

sum(case when interval_start &gt; lag_interval_end then 1 else 0 end) over (partition by account_id order by interval_start rows unbounded preceding) as  grp

答案3

得分: 0

以下是您要翻译的内容：

解决方案 PostgreSQL

带有示例数据的查询

with dt as (
select * from (values
('ABC', DATE'2015-01-01'::timestamp,DATE'2015-02-01'::timestamp),
('ABC', DATE'2015-03-01',DATE'2015-04-01'),
('ABC', DATE'2015-09-01',DATE'2015-10-01'),
('ABC', DATE'2015-10-01',DATE'2015-11-01'),
('ABC', DATE'2017-04-01',DATE'2017-05-01'),
('ABC', DATE'2017-05-01',DATE'2017-06-01'),
('XXX', DATE'2015-01-01',DATE'2015-02-01')
) t(account_id, interval_start, interval_end)),
dt2 as (
select 
  account_id, interval_start, interval_end,
  case 
   interval_start - lag(interval_end) over(partition by account_id order by interval_start) >= interval '1 month'
  when true then 1
  else 0 end delta_act
from  dt)
select 
 account_id, interval_start, interval_end, delta_act,
 sum(delta_act) over(partition by account_id order by interval_start) reactivated
from dt2
order by 1,2

子查询 dt2 根据您的一个月不活动规则计算 delta activation。

case 
   interval_start - lag(interval_end) over(partition by account_id order by interval_start) >= interval '1 month'
  when true then 1
  else 0 end delta_act

最后，您只需使用窗口函数 sum 这个 delta。

sum(delta_act) over(partition by account_id order by interval_start) reactivated

结果

account_id|interval_start     |interval_end       |delta_act|reactivated|
----------+-------------------+-------------------+---------+-----------+
ABC       |2015-01-01 00:00:00|2015-02-01 00:00:00|        0|          0|
ABC       |2015-03-01 00:00:00|2015-04-01 00:00:00|        0|          0|
ABC       |2015-09-01 00:00:00|2015-10-01 00:00:00|        1|          1|
ABC       |2015-10-01 00:00:00|2015-11-01 00:00:00|        0|          1|
ABC       |2017-04-01 00:00:00|2017-05-01 00:00:00|        1|          2|
ABC       |2017-05-01 00:00:00|2017-06-01 00:00:00|        0|          2|
XXX       |2015-01-01 00:00:00|2015-02-01 00:00:00|        0|          0|

请注意，这是您提供的查询和结果的中文翻译。

英文:

Solution for PostgreSQL

Query with sample data

with dt as (
select * from (values
(&#39;ABC&#39;, DATE&#39;2015-01-01&#39;::timestamp,DATE&#39;2015-02-01&#39;::timestamp),
(&#39;ABC&#39;, DATE&#39;2015-03-01&#39;,DATE&#39;2015-04-01&#39;),
(&#39;ABC&#39;, DATE&#39;2015-09-01&#39;,DATE&#39;2015-10-01&#39;),
(&#39;ABC&#39;, DATE&#39;2015-10-01&#39;,DATE&#39;2015-11-01&#39;),
(&#39;ABC&#39;, DATE&#39;2017-04-01&#39;,DATE&#39;2017-05-01&#39;),
(&#39;ABC&#39;, DATE&#39;2017-05-01&#39;,DATE&#39;2017-06-01&#39;),
(&#39;XXX&#39;, DATE&#39;2015-01-01&#39;,DATE&#39;2015-02-01&#39;)
) t(account_id, interval_start, interval_end)),
dt2 as (
select 
  account_id, interval_start, interval_end,
  case 
   interval_start - lag(interval_end) over(partition by account_id order by interval_start) &gt;= interval &#39;1 month&#39;
  when true then 1
  else 0 end delta_act
from  dt)
select 
 account_id, interval_start, interval_end, delta_act,
 sum(delta_act) over(partition by account_id order by interval_start) reactivated
from dt2
order by 1,2

The subquery dt2 calculate the delta aktivation based on you one month inactivity rule.

  case 
   interval_start - lag(interval_end) over(partition by account_id order by interval_start) &gt;= interval &#39;1 month&#39;
  when true then 1
  else 0 end delta_act

Finally you must only sum this delta using the window function

 sum(delta_act) over(partition by account_id order by interval_start) reactivated

Result

account_id|interval_start     |interval_end       |delta_act|reactivated|
----------+-------------------+-------------------+---------+-----------+
ABC       |2015-01-01 00:00:00|2015-02-01 00:00:00|        0|          0|
ABC       |2015-03-01 00:00:00|2015-04-01 00:00:00|        0|          0|
ABC       |2015-09-01 00:00:00|2015-10-01 00:00:00|        1|          1|
ABC       |2015-10-01 00:00:00|2015-11-01 00:00:00|        0|          1|
ABC       |2017-04-01 00:00:00|2017-05-01 00:00:00|        1|          2|
ABC       |2017-05-01 00:00:00|2017-06-01 00:00:00|        0|          2|
XXX       |2015-01-01 00:00:00|2015-02-01 00:00:00|        0|          0|

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

如何在SQL中使用日期间隔进行分区

问题

答案1

答案2

答案3

REGEXP 用于字符串格式化，以空格分隔字符和数字进行分组。

我如何在SQL中获取两个日期之间每周的总订单数量？

SQL查询在VBA中执行时，粘贴时丢失小数位。

选择字段，除非另一个字段包含重复。

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。