Dynamically generating date range starts in SQL
Question
Imagine you have a set of dates. Any date within X days of the earliest date should be "merged" into that date. You then repeat the process, starting from the earliest remaining date, until every date point has been merged.
For example:
| ID | DatePoints |
|---|---|
| 1 | 2023-01-01 |
| 2 | 2023-01-02 |
| 3 | 2023-01-12 |
| 4 | 2023-01-21 |
| 5 | 2023-02-01 |
| 6 | 2023-02-02 |
| 7 | 2023-03-01 |
If you applied this rule to this data using 10 days as your X, you would end up with this output:
| DateRangeStarts |
|---|
| 2023-01-01 |
| 2023-01-12 |
| 2023-02-01 |
| 2023-03-01 |
IDs 1 and 2 fall into range 1, IDs 3 and 4 into range 2, IDs 5 and 6 into range 3, and ID 7 into range 4.
Is there any way to do this without a loop? The answer can target either SQL Server or BigQuery. Thanks.
Answer 1
Score: 1
You could consider something like the following. It's not pretty and I'm not at all confident it is the best solution, but I do think it works. Maybe it's a good starting point for you to work from.
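
    WITH cte AS
    (
        -- Anchor: the earliest date in the table starts the first range.
        SELECT MIN(datepoint) AS datepoint
        FROM test
        UNION ALL
        -- Recursive step: the earliest date more than 10 days past the cte's current maximum.
        SELECT MIN(t.datepoint) OVER () AS datepoint
        FROM test t
        CROSS APPLY (SELECT MAX(cte.datepoint) OVER () AS md FROM cte) c
        WHERE t.datepoint > DATEADD(DAY, 10, c.md)
    )
    -- The recursive step returns each range start several times, hence DISTINCT.
    SELECT DISTINCT datepoint
    FROM cte
    ORDER BY datepoint;
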
(You might want to change the `>` to a `>=`, depending on what counts as within X days.)
The basic idea is to get the minimum date from your table into the cte, then recursively get the minimum date from your table that is bigger than the current maximum date in the cte + X days.
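To make that rule concrete, here is the same greedy pass as a minimal Python sketch. It is not a substitute for the query, just a way to check results on small data. The function name `date_range_starts` and its `inclusive` flag are hypothetical names I made up for illustration; `inclusive=True` is meant to mirror the `>` comparison in the query.

```python
from datetime import date, timedelta


def date_range_starts(dates, x_days=10, inclusive=True):
    """Return the start date of each merged range (hypothetical helper)."""
    starts = []
    for d in sorted(dates):
        if not starts:
            # The earliest date always opens the first range.
            starts.append(d)
            continue
        limit = starts[-1] + timedelta(days=x_days)
        # inclusive=True: a date exactly x_days after the range start still merges
        # (the query's `>`); inclusive=False: it opens a new range (`>=`).
        opens_new_range = d > limit if inclusive else d >= limit
        if opens_new_range:
            starts.append(d)
    return starts


# Sample data from the question.
points = [
    date(2023, 1, 1), date(2023, 1, 2), date(2023, 1, 12), date(2023, 1, 21),
    date(2023, 2, 1), date(2023, 2, 2), date(2023, 3, 1),
]
for start in date_range_starts(points):
    print(start.isoformat())
```

On the question's sample data this should print the four range starts listed in the expected output. Because the dates are visited in sorted order, only the most recent range start needs to be compared against, which corresponds to the "current maximum date in the cte" in the recursive version.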
It gets messy because of the limitations SQL Server places on recursive CTEs. They can't be used in subqueries, with normal `OUTER JOIN`s, or with aggregate functions. Therefore, I use `CROSS APPLY` and the window versions of `min`/`max`. This gets the correct result, but multiple times, so I'm forced to use `DISTINCT` to clean it up afterward.
Depending on your data, it might be better to do a loop anyway, but I think this is an option to consider.
Here's a [Fiddle][1] of it working.

[1]: https://dbfiddle.uk/K5CKmJsx