英文:
Oracle-How to exclude several non-overlapping date period from a certain period?
问题
以下是代码的翻译部分:
With
RawTable as (
Select 'A' ColA,'AA' ColB,
To_Date('2023-02-10','yyyy-mm-dd') START_DATE,To_Date('2023-02-23','yyyy-mm-dd') END_DATE,
To_Date('2023-02-11','yyyy-mm-dd') Exclude_SDate,To_Date('2023-02-13','yyyy-mm-dd') EXCLUDE_EDATE
from dual union all
Select 'A' ColA,'AA' ColB,
To_Date('2023-02-10','yyyy-mm-dd') START_DATE,To_Date('2023-02-23','yyyy-mm-dd') END_DATE,
To_Date('2023-02-15','yyyy-mm-dd') Exclude_SDate,To_Date('2023-02-18','yyyy-mm-dd') EXCLUDE_EDATE
from dual union all
Select 'A' ColA,'AA' ColB,
To_Date('2023-02-10','yyyy-mm-dd') START_DATE,To_Date('2023-02-23','yyyy-mm-dd') END_DATE,
To_Date('2023-02-20','yyyy-mm-dd') Exclude_SDate,To_Date('2023-02-22','yyyy-mm-dd') EXCLUDE_EDATE
from dual union all
Select 'B' ColA,'BB' ColB,
To_Date('2023-02-01','yyyy-mm-dd') START_DATE,To_Date('2023-02-20','yyyy-mm-dd') END_DATE,
To_Date('2023-01-10','yyyy-mm-dd') Exclude_SDate,To_Date('2023-02-13','yyyy-mm-dd') EXCLUDE_EDATE
from dual union all
Select 'B' ColA,'BB' ColB,
To_Date('2023-02-01','yyyy-mm-dd') START_DATE,To_Date('2023-02-20','yyyy-mm-dd') END_DATE,
To_Date('2023-02-15','yyyy-mm-dd') Exclude_SDate,To_Date('2023-02-27','yyyy-mm-dd') ExcludeEDate
from dual union all
Select 'B' ColA,'BB' ColB,
To_Date('2023-02-01','yyyy-mm-dd') START_DATE,To_Date('2023-02-20','yyyy-mm-dd') END_DATE,
To_Date('2023-01-15','yyyy-mm-dd') Exclude_SDate,To_Date('2023-01-19','yyyy-mm-dd') EXCLUDE_EDATE
from dual union all
Select 'C' ColA,'CC' ColB,
To_Date('2023-02-09','yyyy-mm-dd') START_DATE,To_Date('2023-02-12','yyyy-mm-dd') END_DATE,
To_Date('2023-02-08','yyyy-mm-dd') Exclude_SDate,To_Date('2023-02-13','yyyy-mm-dd') EXCLUDE_EDATE
from dual union all
Select 'C' ColA,'CC' ColB,
To_Date('2023-02-09','yyyy-mm-dd') START_DATE,To_Date('2023-02-12','yyyy-mm-dd') END_DATE,
To_Date('2023-01-28','yyyy-mm-dd') Exclude_SDate,To_Date('2023-02-02','yyyy-mm-dd') EXCLUDE_EDATE
from dual
)
Select * from RawTable
请注意,这是原始代码的翻译,仅包括代码部分,没有额外的内容。
英文:
I have a sample data as below
With
RawTable as (
Select 'A' ColA,'AA' ColB,
To_Date('2023-02-10','yyyy-mm-dd') START_DATE,To_Date('2023-02-23','yyyy-mm-dd') END_DATE,
To_Date('2023-02-11','yyyy-mm-dd') Exclude_SDate,To_Date('2023-02-13','yyyy-mm-dd') EXCLUDE_EDATE
from dual union all
Select 'A' ColA,'AA' ColB,
To_Date('2023-02-10','yyyy-mm-dd') START_DATE,To_Date('2023-02-23','yyyy-mm-dd') END_DATE,
To_Date('2023-02-15','yyyy-mm-dd') Exclude_SDate,To_Date('2023-02-18','yyyy-mm-dd') EXCLUDE_EDATE
from dual union all
Select 'A' ColA,'AA' ColB,
To_Date('2023-02-10','yyyy-mm-dd') START_DATE,To_Date('2023-02-23','yyyy-mm-dd') END_DATE,
To_Date('2023-02-20','yyyy-mm-dd') Exclude_SDate,To_Date('2023-02-22','yyyy-mm-dd') EXCLUDE_EDATE
from dual union all
Select 'B' ColA,'BB' ColB,
To_Date('2023-02-01','yyyy-mm-dd') START_DATE,To_Date('2023-02-20','yyyy-mm-dd') END_DATE,
To_Date('2023-01-10','yyyy-mm-dd') Exclude_SDate,To_Date('2023-02-13','yyyy-mm-dd') EXCLUDE_EDATE
from dual union all
Select 'B' ColA,'BB' ColB,
To_Date('2023-02-01','yyyy-mm-dd') START_DATE,To_Date('2023-02-20','yyyy-mm-dd') END_DATE,
To_Date('2023-02-15','yyyy-mm-dd') Exclude_SDate,To_Date('2023-02-27','yyyy-mm-dd') ExcludeEDate
from dual union all
Select 'B' ColA,'BB' ColB,
To_Date('2023-02-01','yyyy-mm-dd') START_DATE,To_Date('2023-02-20','yyyy-mm-dd') END_DATE,
To_Date('2023-01-15','yyyy-mm-dd') Exclude_SDate,To_Date('2023-01-19','yyyy-mm-dd') EXCLUDE_EDATE
from dual union all
Select 'C' ColA,'CC' ColB,
To_Date('2023-02-09','yyyy-mm-dd') START_DATE,To_Date('2023-02-12','yyyy-mm-dd') END_DATE,
To_Date('2023-02-08','yyyy-mm-dd') Exclude_SDate,To_Date('2023-02-13','yyyy-mm-dd') EXCLUDE_EDATE
from dual union all
Select 'C' ColA,'CC' ColB,
To_Date('2023-02-09','yyyy-mm-dd') START_DATE,To_Date('2023-02-12','yyyy-mm-dd') END_DATE,
To_Date('2023-01-28','yyyy-mm-dd') Exclude_SDate,To_Date('2023-02-02','yyyy-mm-dd') EXCLUDE_EDATE
from dual
)
Select * from RawTable
RawData
https://i.stack.imgur.com/vtZE4.jpg
I’d like to know how to remove non-overlapping date segment of ExcludeSDate~ExcludeEDate from StartDate~EndDate for each group of ColA and ColB.
Date span of StartDate~EndDate may be sliced into several segments.
And if ExcludeSDate~ExcludeEDate contain SartDate~EndDate,then final result need to exclude this group.
Each segment of ExcludeSDate~ExcludeEDtae does not overlap.
For each group of ColA and ColB, StartDate and EndDate are the same for each row of this group.
I try to list each possibilities using case when clause, but the hard part is how to group by ColA and ColB to give correct result.
The desired result is as follows:
Desired result
https://i.stack.imgur.com/r1KBI.jpg
I tried to use case when expression in select clause but stuck in the problem of grouping ColA and ColB to give correct date span.
2023-02-27 Edit
To make Column names more readable, I change StartDate to Start_Date, EndDate to End_Date, ExlcudeSDate to Exclude_SDate,ExlcudeEdate to Exclude_EDate.
(SDate shorthands for Start_Date,EDate for End_Date)
Thanks dr's answer, it gives me lots of insight by making Exclude_EDate as the new Start_Date if Exclude_EDate cut between Start_Date and End_Date,then take Exclude_SDate of next record as new End_Date to complete this question.
This deals End time perfectly, but I found for each group of ColA and ColB,if the 1st record of Exclucde_SDate is between Start_Date and End_Date, then the output will miss a period, ie. from 1st record Start_Date to Exlcude_SDate.
For example if first row of Exclude_SDate changes to 2023-02-11,then a period of 2023-02-10 to 2023-02-11 is missing.
I add some queries to union dr's answer by using Lag Function to check if current row is the first row and if it is,then further check if Exclude_SDate is between Start_Date and End_Date.
Here's the final query.
Select COLA, COLB,START_DATE_2 Start_Date,END_DATE_2 END_DATE
from (
Select COLA, COLB, START_DATE, END_DATE, Exclude_SDate, EXCLUDE_EDATE,
CASE WHEN EXCLUDE_EDATE BETWEEN START_DATE And END_DATE
THEN EXCLUDE_EDATE END "START_DATE_2",
CASE WHEN EXCLUDE_EDATE BETWEEN START_DATE And END_DATE THEN
CASE WHEN LEAD(COLA) OVER(Order By COLA, START_DATE) = COLA And
LEAD(COLB) OVER(Order By COLA, COLB, START_DATE) = COLB
THEN LEAD(Exclude_SDate) OVER(Order By COLA, COLB, START_DATE)
ELSE END_DATE
END
END "END_DATE_2"
From rawtable
--below:deal with problem that initial Exclude_SDate cut between Start_Date and End_Date
Union all
Select R.COLA, R.COLB, R.START_DATE, R.END_DATE, R.Exclude_SDate, R.EXCLUDE_EDATE,
Case When Lag(R.ColA,1,null) over(partition by R.ColA Order by R.ColA,R.START_DATE) is null And
Lag(R.ColB,1,null) over(partition by R.ColA,R.ColB Order by R.ColA,R.ColB,R.START_DATE) is null And
R.Exclude_SDate between R.Start_Date and R.End_Date
then R.Start_Date
Else Null
End "START_DATE_2",
Case When Lag(R.ColA,1,null) over(partition by R.ColA Order by R.ColA,R.START_DATE) is null And
Lag(R.ColB,1,null) over(partition by R.ColA,R.ColB Order by R.ColA,R.ColB,R.START_DATE) is null And
R.Exclude_SDate between R.Start_Date and R.End_Date
then R.Exclude_SDate
Else Null
End "END_DATE_2"
From RawTable R
)
Where Start_Date_2 is not null
Order by ColA,ColB,Start_Date_2
答案1
得分: 1
尝试在这里使用CASE表达式与LEAD() OVER()分析函数:
SELECT COL_A, COL_B, START_DATE_2 "START_DATE", END_DATE_2 "END_DATE"
FROM ( Select
COL_A, COL_B, START_DATE, END_DATE, EXCLUDES_DATE, EXCLUDE_EDATE,
CASE WHEN EXCLUDE_EDATE BETWEEN START_DATE And END_DATE THEN EXCLUDE_EDATE END "START_DATE_2",
CASE WHEN EXCLUDE_EDATE BETWEEN START_DATE And END_DATE THEN
CASE WHEN LEAD(COL_A) OVER(Order By COL_A, START_DATE) = COL_A And
LEAD(COL_B) OVER(Order By COL_A, COL_B, START_DATE) = COL_B
THEN LEAD(EXCLUDES_DATE) OVER(Order By COL_A, COL_B, START_DATE)
ELSE END_DATE
END
END "END_DATE_2"
From
rawtable
Order By COL_A, COL_B, START_DATE )
Where
START_DATE_2 Is Not Null
使用您提供的示例数据,结果应为:
|COL_A |COL_B |START_DATE |END_DATE |
|------ |---------- |--------------- |------------|
|A |AA |13-FEB-23 |15-FEB-23 |
|A |AA |18-FEB-23 |20-FEB-23 |
|A |AA |22-FEB-23 |23-FEB-23 |
|B |BB |13-FEB-23 |15-FEB-23 |
英文:
Try to use CASE expression with LEAD() OVER() anlytic function like here:
SELECT COL_A, COL_B, START_DATE_2 "START_DATE", END_DATE_2 "END_DATE"
FROM ( Select
COL_A, COL_B, START_DATE, END_DATE, EXCLUDES_DATE, EXCLUDE_EDATE,
CASE WHEN EXCLUDE_EDATE BETWEEN START_DATE And END_DATE THEN EXCLUDE_EDATE END "START_DATE_2",
CASE WHEN EXCLUDE_EDATE BETWEEN START_DATE And END_DATE THEN
CASE WHEN LEAD(COL_A) OVER(Order By COL_A, START_DATE) = COL_A And
LEAD(COL_B) OVER(Order By COL_A, COL_B, START_DATE) = COL_B
THEN LEAD(EXCLUDES_DATE) OVER(Order By COL_A, COL_B, START_DATE)
ELSE END_DATE
END
END "END_DATE_2"
From
rawtable
Order By COL_A, COL_B, START_DATE )
Where
START_DATE_2 Is Not Null
With your sample data:
COL_A | COL_B | START_DATE | END_DATE | EXCLUDES_DATE | EXCLUDE_EDATE |
---|---|---|---|---|---|
A | AA | 10-FEB-23 | 23-FEB-23 | 01-FEB-23 | 13-FEB-23 |
A | AA | 10-FEB-23 | 23-FEB-23 | 15-FEB-23 | 18-FEB-23 |
A | AA | 10-FEB-23 | 23-FEB-23 | 20-FEB-23 | 22-FEB-23 |
B | BB | 01-FEB-23 | 20-FEB-23 | 10-JAN-23 | 13-FEB-23 |
B | BB | 01-FEB-23 | 20-FEB-23 | 15-FEB-23 | 27-FEB-23 |
B | BB | 01-FEB-23 | 20-FEB-23 | 15-JAN-23 | 19-JAN-23 |
C | CC | 09-FEB-23 | 12-FEB-23 | 08-FEB-23 | 13-FEB-23 |
C | CC | 09-FEB-23 | 12-FEB-23 | 28-JAN-23 | 02-FEB-23 |
... the result should be:
COL_A | COL_B | START_DATE | END_DATE |
---|---|---|---|
A | AA | 13-FEB-23 | 15-FEB-23 |
A | AA | 18-FEB-23 | 20-FEB-23 |
A | AA | 22-FEB-23 | 23-FEB-23 |
B | BB | 13-FEB-23 | 15-FEB-23 |
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论