Dynamically generating date range starts in SQL

Question

Imagine you have a set of dates. Any date that is within X days of the lowest date should be "merged" into that lowest date. Then you repeat the process, starting from the lowest date that has not been merged yet, until all date points have been merged.

For example:

ID  DatePoints
1   2023-01-01
2   2023-01-02
3   2023-01-12
4   2023-01-21
5   2023-02-01
6   2023-02-02
7   2023-03-01

If you applied this rule to this data using 10 days as your X, you would end up with this output:

DateRangeStarts
2023-01-01
2023-01-12
2023-02-01
2023-03-01

IDs 1 and 2 go into range 1, IDs 3 and 4 into range 2, IDs 5 and 6 into range 3, and ID 7 into range 4.
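
To pin down the rule, here is a minimal procedural sketch of it in Python. The function name `range_starts` is made up for illustration, and the sketch assumes that a date exactly X days after a range start still belongs to that range.

```python
from datetime import date, timedelta


def range_starts(points, x_days):
    """Greedily merge dates into ranges.

    points: dict mapping ID -> date.
    Returns (list of range start dates, dict mapping ID -> its range start).
    Assumption: a date exactly x_days after a start is still inside that range.
    """
    starts = []
    assignment = {}
    current = None
    # Walk the dates in ascending order, remembering the current range start.
    for id_, d in sorted(points.items(), key=lambda kv: kv[1]):
        # Open a new range once a date is more than x_days past the current start.
        if current is None or d > current + timedelta(days=x_days):
            current = d
            starts.append(d)
        assignment[id_] = current
    return starts, assignment


points = {
    1: date(2023, 1, 1),
    2: date(2023, 1, 2),
    3: date(2023, 1, 12),
    4: date(2023, 1, 21),
    5: date(2023, 2, 1),
    6: date(2023, 2, 2),
    7: date(2023, 3, 1),
}
starts, assignment = range_starts(points, 10)
print([s.isoformat() for s in starts])
# ['2023-01-01', '2023-01-12', '2023-02-01', '2023-03-01']
```

This is exactly the kind of loop I would like to avoid in SQL.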

Is there any way to do this without a loop? An answer can work in either SQL Server or BigQuery. Thanks.

Answer 1

Score: 1

You could consider something like the following. It's not pretty, and I'm not at all confident it is the best solution, but I do think it works. Maybe it's a good starting point for you to work from.

-- Assumes a table test(datepoint) that holds the date points.
WITH cte AS
(
  -- Anchor: the earliest date starts the first range.
  SELECT MIN(datepoint) AS datepoint
  FROM test
  UNION ALL
  -- Recursive step: the earliest date more than 10 days after the current maximum in the cte.
  SELECT MIN(t.datepoint) OVER() AS datepoint
  FROM test t CROSS APPLY (SELECT MAX(cte.datepoint) OVER() AS md FROM cte) c
  WHERE t.datepoint > DATEADD(DAY, 10, c.md)
)
SELECT DISTINCT datepoint  -- each range start comes back several times
FROM cte
ORDER BY datepoint;

(You might want to change the > to a >=, depending on what counts as "within X days". With >, a date exactly 10 days after a range start is still merged into that range.)

The basic idea is to get the minimum date from your table into the CTE, then recursively get the minimum date from your table that is later than the current maximum date in the CTE plus X days.

It gets messy because of the limitations SQL Server places on recursive CTEs. Inside the recursive member, the CTE can't be used in subqueries, with normal OUTER JOINs, or with aggregate functions. Therefore, I use CROSS APPLY and the window versions of MIN/MAX. This gets the correct result, but multiple times, so I'm forced to use DISTINCT to clean it up afterward.

Depending on your data, it might be better to use a loop anyway, but I think this is an option to consider.

Here's a Fiddle of it working: https://dbfiddle.uk/K5CKmJsx
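
If you want to sanity-check the idea without a SQL Server instance, the same recursion can be sketched in SQLite through Python's built-in `sqlite3` module. SQLite is used here only because it ships with Python. The `WITH RECURSIVE` keyword, the `date(..., '+N days')` call, and the scalar subquery are SQLite syntax and an assumption of this sketch, not the T-SQL above. The `id` column is also added just for the sample data.

```python
import sqlite3

# In-memory database holding the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER PRIMARY KEY, datepoint TEXT)")
conn.executemany(
    "INSERT INTO test (id, datepoint) VALUES (?, ?)",
    [
        (1, "2023-01-01"),
        (2, "2023-01-02"),
        (3, "2023-01-12"),
        (4, "2023-01-21"),
        (5, "2023-02-01"),
        (6, "2023-02-02"),
        (7, "2023-03-01"),
    ],
)

# Anchor: the earliest date. Recursive step: the earliest date that is more
# than X days after the previous range start. The step yields NULL when no
# such date exists, and the WHERE clause then stops the recursion.
QUERY = """
WITH RECURSIVE cte(datepoint) AS (
  SELECT MIN(datepoint) FROM test
  UNION ALL
  SELECT (SELECT MIN(t.datepoint)
          FROM test t
          WHERE t.datepoint > date(cte.datepoint, '+' || ? || ' days'))
  FROM cte
  WHERE cte.datepoint IS NOT NULL
)
SELECT datepoint FROM cte WHERE datepoint IS NOT NULL ORDER BY datepoint
"""

rows = [row[0] for row in conn.execute(QUERY, (10,))]
print(rows)
# ['2023-01-01', '2023-01-12', '2023-02-01', '2023-03-01']
```

Because SQLite accepts an aggregate inside the scalar subquery, each step produces exactly one row, so no DISTINCT is needed in this variant.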

huangapple
  • Posted on February 10, 2023, 06:34:10
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75405147.html