英文:
Determine total number of aggregate weeks excluding overlap periods
问题
我需要计算总共有多少个不重叠的周数,但要排除重叠的日期范围。我尝试过在表上进行自连接,但无法弄清楚。
所需输出:
期望的结果集将是一个新的结果集(具有连续的日期范围),排除日期重叠。例如,在上面的示例中,我想要删除从2018-09-08到2018-09-11的日期,因为这4天已经被上面的行覆盖。因此,它应该是根据时间顺序显示的所有行,具有合并的fromdate/todate。正如您所看到的,日期之间存在间隙,我不能使用Min(fromdate) | Max(todate)。
number | name | fromdate | todate | daycount |
---|---|---|---|---|
887 | DAV | 30/05/2018 | 29/05/2019 | 364 |
887 | DAV | 24/07/2019 | 07/08/2019 | 14 |
887 | DAV | 09/08/2019 | 10/08/2019 | 1 |
887 | DAV | 18/09/2019 | 17/11/2019 | 60 |
887 | DAV | 09/12/2019 | 17/02/2020 | 70 |
887 | DAV | 30/03/2020 | 30/03/2020 | 0 |
887 | DAV | 03/11/2021 | 02/03/2022 | 119 |
887 | DAV | 03/03/2022 | 30/03/2022 | 27 |
887 | DAV | 31/03/2022 | 05/07/2023 | 461 |
感谢您的帮助。
英文:
I have query which returns thousands of rows, with a number and name, fromdate and today. I need to calculate the total number of aggregate weeks but exclude the overlapping date range. I have not been able to find a way.
As you can see, the 7th row of data is 2018-09-12 and next starting date is 2018-09-08. All of these rows is only for one number, but in other number columns, there are a lot more overlaps but example is the same. I have to exclude these overlaps. e.g. in another example, row one is the whole month of January, and row two is a week from 10 to 17th in it. so this is another example.
I tried to do a self join on the table but could not figure it out.
select *
from #RW r1, #RW r2
where r1.number = r2.number and r1.name = r2.name
and r1.Fromdate < r2.Fromdate and r1.todate > r2.Fromdate
and r1.number = '887'
**NOTE: If daycount = 0 , then that row should be ignored. it was generated based on another condition before I exported.
number | name | fromdate | todate | daycount |
---|---|---|---|---|
887 | DAV | NULL | NULL | 0 |
887 | DAV | 30/05/2018 | 05/06/2018 | 6 |
887 | DAV | 05/06/2018 | 19/06/2018 | 14 |
887 | DAV | 19/06/2018 | 18/07/2018 | 29 |
887 | DAV | 18/07/2018 | 15/08/2018 | 28 |
887 | DAV | 15/08/2018 | 12/09/2018 | 28 |
887 | DAV | 08/09/2018 | 17/10/2018 | 0 |
887 | DAV | 17/10/2018 | 07/11/2018 | 0 |
887 | DAV | 28/11/2018 | 09/01/2019 | 42 |
887 | DAV | 09/01/2019 | 05/02/2019 | 27 |
887 | DAV | 06/02/2019 | 29/05/2019 | 112 |
887 | DAV | 24/07/2019 | 07/08/2019 | 0 |
887 | DAV | 09/08/2019 | 09/08/2019 | 0 |
887 | DAV | 10/08/2019 | 10/08/2019 | 0 |
887 | DAV | 18/09/2019 | 17/11/2019 | 60 |
887 | DAV | 09/12/2019 | 20/01/2020 | 42 |
887 | DAV | 20/01/2020 | 17/02/2020 | 28 |
887 | DAV | 30/03/2020 | 30/03/2020 | 0 |
887 | DAV | 03/11/2021 | 10/11/2021 | 7 |
887 | DAV | 11/11/2021 | 14/12/2021 | 33 |
887 | DAV | 15/12/2021 | 31/01/2022 | 47 |
887 | DAV | 01/02/2022 | 27/02/2022 | 26 |
887 | DAV | 28/02/2022 | 02/03/2022 | 2 |
887 | DAV | 03/03/2022 | 30/03/2022 | 27 |
887 | DAV | 31/03/2022 | 12/04/2022 | 12 |
887 | DAV | 13/04/2022 | 27/04/2022 | 14 |
887 | DAV | 28/04/2022 | 24/05/2022 | 26 |
887 | DAV | 25/05/2022 | 22/06/2022 | 28 |
887 | DAV | 23/06/2022 | 20/07/2022 | 27 |
887 | DAV | 21/07/2022 | 16/08/2022 | 26 |
887 | DAV | 17/08/2022 | 14/09/2022 | 28 |
887 | DAV | 15/09/2022 | 09/11/2022 | 55 |
887 | DAV | 10/11/2022 | 07/12/2022 | 27 |
887 | DAV | 08/12/2022 | 11/01/2023 | 34 |
887 | DAV | 12/01/2023 | 05/02/2023 | 24 |
887 | DAV | 06/02/2023 | 21/02/2023 | 15 |
887 | DAV | 22/02/2023 | 23/03/2023 | 29 |
887 | DAV | 24/03/2023 | 08/05/2023 | 45 |
887 | DAV | 09/05/2023 | 05/06/2023 | 27 |
887 | DAV | 05/06/2023 | 05/07/2023 | 30 |
Desired output:
the desired result set would be to have a new result set (with consecutive date ranges) excluding the date overlaps. e.g. in above example, I want to remove the dates from 2018-09-08 until 2018-09-11 as these 4 days are already covered by the row above it. So naturally it should be all rows showing based on chronologica order, with merged fromdate/todate. as you can see, there are gaps between dates and I can't go with Min(fromdate) | Max(todate).
number | name | fromdate | todate | daycount |
---|---|---|---|---|
887 | DAV | 30/05/2018 | 29/05/2019 | 364 |
887 | DAV | 24/07/2019 | 07/08/2019 | 14 |
887 | DAV | 09/08/2019 | 10/08/2019 | 1 |
887 | DAV | 18/09/2019 | 17/11/2019 | 60 |
887 | DAV | 09/12/2019 | 17/02/2020 | 70 |
887 | DAV | 30/03/2020 | 30/03/2020 | 0 |
887 | DAV | 03/11/2021 | 02/03/2022 | 119 |
887 | DAV | 03/03/2022 | 30/03/2022 | 27 |
887 | DAV | 31/03/2022 | 05/07/2023 | 461 |
Thanks in advance for your help,
答案1
得分: 1
这是我的递归公共表达式(CTE)解决方案:
;WITH cteGeneratedDates AS (
SELECT
number,
[name],
fromdate,
todate
FROM TableName
WHERE fromdate IS NOT NULL
AND daycount > 0 -- 如果应计算0天计数,请注释掉此行
UNION ALL
SELECT
number,
[name],
fromdate = DATEADD(DAY, 1, fromdate),
todate
FROM cteGeneratedDates
WHERE DATEADD(DAY, 1, fromdate) <= todate
),
cteGroupedConsecutiveDates AS (
SELECT
number,
[name],
fromdate,
ROW_NUMBER() OVER (PARTITION BY number, [name] ORDER BY fromdate) todate
FROM cteGeneratedDates
GROUP BY number, [name], fromdate
)
SELECT
number,
[name],
MIN(fromdate) fromdate,
MAX(fromdate) todate,
DATEDIFF(DAY, MIN(fromdate), MAX(fromdate))
FROM cteGroupedConsecutiveDates
GROUP BY
number,
[name],
DATEDIFF(DAY, fromdate, todate)
ORDER BY fromdate
OPTION (MAXRECURSION 2000)
尽管与您的预期输出有所不同,原因如下:为什么这些组没有在同一行下分组?
- 03/11/2021 - 02/03/2022
- 03/03/2022 - 30/03/2022
- 31/03/2022 - 05/07/2023
在我的解决方案中,上述范围将在同一组中。
此外,07/11/2018和28/11/2018之间没有日期,您是否认为这个组需要拆分?30/05/2018 - 29/05/2019。
英文:
Here's my solution with recursive CTE
;WITH cteGeneratedDates AS (
SELECT
number,
[name],
fromdate,
todate
FROM TableName
WHERE fromdate IS NOT NULL
AND daycount > 0 -- comment this out if 0 daycount should be counted
UNION ALL
SELECT
number,
[name],
fromdate = DATEADD(DAY, 1, fromdate),
todate
FROM cteGeneratedDates
WHERE DATEADD(DAY, 1, fromdate) <= todate
),
cteGroupedConsecutiveDates AS (
SELECT
number,
[name],
fromdate,
ROW_NUMBER() OVER (PARTITION BY number, [name] ORDER BY fromdate) todate
FROM cteGeneratedDates
GROUP BY number, [name], fromdate
)
SELECT
number,
[name],
MIN(fromdate) fromdate,
MAX(fromdate) todate,
DATEDIFF(DAY, MIN(fromdate), MAX(fromdate))
FROM cteGroupedConsecutiveDates
GROUP BY
number,
[name],
DATEDIFF(DAY, fromdate, todate)
ORDER BY fromdate
OPTION (MAXRECURSION 2000)
Though there's difference with your expected output because of the following: why do these groups not grouped under 1 row?
- 03/11/2021 - 02/03/2022
- 03/03/2022 - 30/03/2022
- 31/03/2022 - 05/07/2023
In my solution, above ranges would be in the same group.
Also, there are no dates between 07/11/2018 and 28/11/2018, don't you think this group needs to be broken down? 30/05/2018 - 29/05/2019
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论