英文:
Find only the time periods on a schedule where the time of employees overlap. (AKA those times where everyone is present)
问题
我有一个雇员名单,列出了他们的工作时间,每个雇员的工作时间不同,我需要找出如何确定只有在所有雇员都同时在场的时间段。例如,我需要截断仅有1名或几名雇员在场但不是所有人的时间段。
我尝试使用一个lag函数来确定重叠的时间段,以便适当地截断时间,但似乎没有得到我寻找的结果。
预期的结果类似于:
sDate | startTime | endTime | Duration |
---|---|---|---|
2023-02-23 00:00:00 | 2023-02-23 08:30:00 | 2023-02-23 09:00:00 | 30 |
2023-02-23 00:00:00 | 2023-02-23 09:30:00 | 2023-02-23 10:00:00 | 30 |
2023-02-23 00:00:00 | 2023-02-23 10:30:00 | 2023-02-23 12:00:00 | 90 |
2023-02-23 00:00:00 | 2023-02-23 13:00:00 | 2023-02-23 17:00:00 | 240 |
英文:
I have a list of employees with times they are working and each employee has differing times they are on the job and I need to figure out how to determine ONLY the times where everyone is present at the same time. For instance I need to truncate times where only 1 or a few employees are present but not all.
**sDate | startTime | endTime | name** |
---|---|---|---|
2023-02-23 00:00:00.000 | 2023-02-23 08:00:00.000 | 2023-02-23 10:00:00.000 | John |
2023-02-23 00:00:00.000 | 2023-02-23 10:30:00.000 | 2023-02-23 12:00:00.000 | John |
2023-02-23 00:00:00.000 | 2023-02-23 13:00:00.000 | 2023-02-23 17:00:00.000 | John |
2023-02-23 00:00:00.000 | 2023-02-23 08:30:00.000 | 2023-02-23 09:00:00.000 | Anita |
2023-02-23 00:00:00.000 | 2023-02-23 09:30:00.000 | 2023-02-23 20:00:00.000 | Anita |
I tried using a lag function to determine the overlapping segments so that I could cut off the times appropriately, but I don't seem to be getting the results I'm looking for.
DECLARE @tmpSchedules TABLE (
sDate DATETIME
,startTime DATETIME
,endTime DATETIME
,name varchar(255)
)
INSERT INTO @tmpSchedules
SELECT '20230223', '20230223 08:00', '20230223 10:00', 'John'
UNION ALL
SELECT '20230223', '20230223 10:30', '20230223 12:00', 'John'
UNION ALL
SELECT '20230223', '20230223 13:00', '20230223 17:00', 'John'
UNION ALL
SELECT '20230223', '20230223 08:30', '20230223 09:00', 'Anita'
UNION ALL
SELECT '20230223', '20230223 09:30', '20230223 20:00', 'Anita'
select * from @tmpSchedules
SELECT CA.sDate, CA.startTime, CA.endTime
,CASE
WHEN LAG(CA.endTime) OVER (ORDER BY CA.starttime) > CA.startTime THEN 0
ELSE 1
END OVL
FROM @tmpSchedules CA
Expected results would be similar to
sDate | startTime | endTime | Duration |
---|---|---|---|
2023-02-23 00:00:00 | 2023-02-23 08:30:00 | 2023-02-23 09:00:00 | 30 |
2023-02-23 00:00:00 | 2023-02-23 09:30:00 | 2023-02-23 10:00:00 | 30 |
2023-02-23 00:00:00 | 2023-02-23 10:30:00 | 2023-02-23 12:00:00 | 90 |
2023-02-23 00:00:00 | 2023-02-23 13:00:00 | 2023-02-23 17:00:00 | 240 |
答案1
得分: 0
一个可能的选项是类似以下的代码。您可以在[this Fiddle][1]中看到它的运行。
WITH ranges AS
(
SELECT t startTime, LEAD(t) OVER(ORDER BY t) endTime, DATEDIFF(mi, t, LEAD(t) OVER(ORDER BY t)) duration
FROM
(
SELECT startTime t FROM @tmpSchedules
UNION
SELECT endTime FROM @tmpSchedules
) c
)
SELECT t.sDate, r.startTime, r.endTime, r.duration
FROM ranges r INNER JOIN @tmpSchedules t
on r.startTime >= t.startTime AND r.endTime <= t.endTime
GROUP BY t.sDate, r.startTime, r.endTime, r.duration
HAVING COUNT(DISTINCT name) = (SELECT COUNT(DISTINCT name) FROM @tmpSchedules)
ORDER BY r.startTime
`ranges` CTE通过将`startTime`/`endTime`进行`union`来获取所有可能的区块。(即,如果一个人在8:00开始工作,直到8:30没有其他人开始或结束,那么8:00到8:30是一个区块,在整个时期内具有相同数量的工人)。对于这个操作并不是必须使用CTE,但我认为这样更容易阅读。
主查询将这些潜在范围与实际数据进行连接,`having`子句检查在该范围内工作的工人数量是否等于总工人数量。
这适用于您的测试数据,但可能需要调整如何处理边缘情况(例如,如果一个工人停止工作并在同一时间重新开始,这将报告为两个单独的范围),或者如果需要处理跨越到第二天的班次(我简单地使用`GROUP BY sDate`,因为不清楚如何/是否需要处理这种情况)。
[1]: https://dbfiddle.uk/zCw93hkp
英文:
One possible option is something like the following. You can see it working in this Fiddle.
WITH ranges AS
(
SELECT t startTime, LEAD(t) OVER(ORDER BY t) endTime, DATEDIFF(mi, t, LEAD(t) OVER(ORDER BY t)) duration
FROM
(
SELECT startTime t FROM @tmpSchedules
UNION
SELECT endTime FROM @tmpSchedules
) c
)
SELECT t.sDate, r.startTime, r.endTime, r.duration
FROM ranges r INNER JOIN @tmpSchedules t
on r.startTime >= t.startTime AND r.endTime <= t.endTime
GROUP BY t.sDate, r.startTime, r.endTime, r.duration
HAVING COUNT(DISTINCT name) = (SELECT COUNT(DISTINCT name) FROM @tmpSchedules)
ORDER BY r.startTime
The ranges
CTE gets all of the possible blocks by union
ing the startTime
s/endTime
s together. (i.e. If a person started work at 8:00 and no one else starts or ends until 8:30, then 8:00 to 8:30 is a block that has the same number of workers for the entire period). It's not necessary to use a CTE for this, but I thought it was easier to read.
The main query joins those potential ranges with the actual data and the having
clause checks whether the number of workers working in that range is equal to the total number of workers.
This works with your test data, but possibly you'd need to tweak how it handles edge cases (e.g. if a worker stopped working and restarted at the same time, this would report that as two separate ranges) or if you need to handle shifts that extend into the next day (I simply did GROUP BY sDate
since it wasn't clear how/whether that needed to be handled).
答案2
得分: 0
我对你的示例数据进行了加倍处理:
DECLARE @table TABLE (sDate DATE, startTime DATETIME, endTime DATETIME, name NVARCHAR(50))
INSERT INTO @table (sDate, startTime, endTime, name) VALUES
('2023-02-23 00:00:00.000', '2023-02-23 08:00:00.000', '2023-02-23 10:00:00.000', 'John'),
('2023-02-23 00:00:00.000', '2023-02-23 10:30:00.000', '2023-02-23 12:00:00.000', 'John'),
('2023-02-23 00:00:00.000', '2023-02-23 13:00:00.000', '2023-02-23 17:00:00.000', 'John'),
('2023-02-23 00:00:00.000', '2023-02-23 08:30:00.000', '2023-02-23 09:00:00.000', 'Anita'),
('2023-02-23 00:00:00.000', '2023-02-23 09:30:00.000', '2023-02-23 20:00:00.000', 'Anita');
INSERT INTO @table (sDate, startTime, endTime, name) VALUES
('2023-02-24 00:00:00.000', '2023-02-24 09:00:00.000', '2023-02-24 11:00:00.000', 'John'),
('2023-02-24 00:00:00.000', '2023-02-24 11:30:00.000', '2023-02-24 12:00:00.000', 'John'),
('2023-02-24 00:00:00.000', '2023-02-24 14:00:00.000', '2023-02-24 18:00:00.000', 'John'),
('2023-02-24 00:00:00.000', '2023-02-24 09:30:00.000', '2023-02-24 10:00:00.000', 'Anita'),
('2023-02-24 00:00:00.000', '2023-02-24 10:30:00.000', '2023-02-24 21:00:00.000', 'Anita');
然后使用以下代码获取答案:
DECLARE @SecondsRange INT = 599;
;WITH MinuteRanges AS (
SELECT CAST('00:00:00.0000000' AS TIME) AS RangeStart, DATEADD(MICROSECOND,-1,DATEADD(SECOND,@SecondsRange+1,CAST('00:00:00' AS TIME))) AS RangeEnd
UNION ALL
SELECT DATEADD(SECOND,@SecondsRange+1,RangeStart), DATEADD(SECOND,@SecondsRange+1,RangeEnd)
FROM MinuteRanges
WHERE DATEADD(SECOND,@SecondsRange+1,RangeStart) > RangeStart
), AllInRanges AS (
SELECT t.sDate, mr.RangeStart, mr.RangeEnd, STRING_AGG(name,', ') WITHIN GROUP (ORDER BY name) AS WhosIn, a.allNames,
CASE WHEN LAG(mr.RangeEnd,1) OVER (PARTITION BY sDate ORDER BY mr.RangeStart) IS NULL THEN 1
WHEN LAG(mr.RangeEnd,1) OVER (PARTITION BY sDate ORDER BY mr.RangeStart) <> DATEADD(MICROSECOND,-1,mr.RangeStart) THEN 1
END AS IsRangeStart
FROM MinuteRanges mr
LEFT OUTER JOIN @table t
ON mr.RangeStart >= CAST(t.startTime AS TIME)
AND mr.RangeEnd <= CAST(t.endTime AS TIME)
CROSS APPLY (SELECT STRING_AGG(name,', ') WITHIN GROUP (ORDER BY name) AS allNames FROM (SELECT name FROM @table GROUP BY name) a) a
GROUP BY t.sDate, mr.RangeStart, mr.RangeEnd, a.allNames
HAVING STRING_AGG(name,', ') WITHIN GROUP (ORDER BY name) = a.allNames
), Periods AS (
SELECT sDate, RangeStart, RangeEnd, WhosIn
FROM AllInRanges
WHERE IsRangeStart = 1
UNION ALL
SELECT a.sDate, a.RangeStart, r.RangeEnd, a.WhosIn
FROM Periods a
INNER JOIN AllInRanges r
ON DATEADD(MICROSECOND,1,a.RangeEnd) = r.RangeStart
AND a.sDate = r.sDate
)
SELECT sDate, RangeStart, MAX(RangeEnd) AS RangeEnd,
RIGHT('0' + CAST(((DATEDIFF(SECOND,RangeStart, MAX(RangeEnd))+1)/3600) AS VARCHAR(2)),2)+':'+
RIGHT('0' + CAST(((DATEDIFF(SECOND,RangeStart, MAX(RangeEnd))+1)%3600)/60 AS VARCHAR(2)),2)+':'+
RIGHT('0' + CAST(((DATEDIFF(SECOND,RangeStart, MAX(RangeEnd))+1)%60) AS VARCHAR(2)),2) AS Duration,
WhosIn
FROM Periods
GROUP BY sDate, RangeStart, WhosIn
ORDER BY SDate, RangeStart
OPTION (MAXRECURSION 0)
英文:
I doubled up on your exampled data:
DECLARE @table TABLE (sDate DATE, startTime DATETIME, endTime DATETIME, name NVARCHAR(50))
INSERT INTO @table (sDate, startTime, endTime, name) VALUES
('2023-02-23 00:00:00.000', '2023-02-23 08:00:00.000', '2023-02-23 10:00:00.000', 'John'),
('2023-02-23 00:00:00.000', '2023-02-23 10:30:00.000', '2023-02-23 12:00:00.000', 'John'),
('2023-02-23 00:00:00.000', '2023-02-23 13:00:00.000', '2023-02-23 17:00:00.000', 'John'),
('2023-02-23 00:00:00.000', '2023-02-23 08:30:00.000', '2023-02-23 09:00:00.000', 'Anita'),
('2023-02-23 00:00:00.000', '2023-02-23 09:30:00.000', '2023-02-23 20:00:00.000', 'Anita');
INSERT INTO @table (sDate, startTime, endTime, name) VALUES
('2023-02-24 00:00:00.000', '2023-02-24 09:00:00.000', '2023-02-24 11:00:00.000', 'John'),
('2023-02-24 00:00:00.000', '2023-02-24 11:30:00.000', '2023-02-24 12:00:00.000', 'John'),
('2023-02-24 00:00:00.000', '2023-02-24 14:00:00.000', '2023-02-24 18:00:00.000', 'John'),
('2023-02-24 00:00:00.000', '2023-02-24 09:30:00.000', '2023-02-24 10:00:00.000', 'Anita'),
('2023-02-24 00:00:00.000', '2023-02-24 10:30:00.000', '2023-02-24 21:00:00.000', 'Anita');
Then used this to get an answer:
DECLARE @SecondsRange INT = 599;
;WITH MinuteRanges AS (
SELECT CAST('00:00:00.0000000' AS TIME) AS RangeStart, DATEADD(MICROSECOND,-1,DATEADD(SECOND,@SecondsRange+1,CAST('00:00:00' AS TIME))) AS RangeEnd
UNION ALL
SELECT DATEADD(SECOND,@SecondsRange+1,RangeStart), DATEADD(SECOND,@SecondsRange+1,RangeEnd)
FROM MinuteRanges
WHERE DATEADD(SECOND,@SecondsRange+1,RangeStart) > RangeStart
), AllInRanges AS (
SELECT t.sDate, mr.RangeStart, mr.RangeEnd, STRING_AGG(name,', ') WITHIN GROUP (ORDER BY name) AS WhosIn, a.allNames,
CASE WHEN LAG(mr.RangeEnd,1) OVER (PARTITION BY sDate ORDER BY mr.RangeStart) IS NULL THEN 1
WHEN LAG(mr.RangeEnd,1) OVER (PARTITION BY sDate ORDER BY mr.RangeStart) <> DATEADD(MICROSECOND,-1,mr.RangeStart) THEN 1
END AS IsRangeStart
FROM MinuteRanges mr
LEFT OUTER JOIN @table t
ON mr.RangeStart >= CAST(t.startTime AS TIME)
AND mr.RangeEnd <= CAST(t.endTime AS TIME)
CROSS APPLY (SELECT STRING_AGG(name,', ') WITHIN GROUP (ORDER BY name) AS allNames FROM (SELECT name FROM @table GROUP BY name) a) a
GROUP BY t.sDate, mr.RangeStart, mr.RangeEnd, a.allNames
HAVING STRING_AGG(name,', ') WITHIN GROUP (ORDER BY name) = a.allNames
), Periods AS (
SELECT sDate, RangeStart, RangeEnd, WhosIn
FROM AllInRanges
WHERE IsRangeStart = 1
UNION ALL
SELECT a.sDate, a.RangeStart, r.RangeEnd, a.WhosIn
FROM Periods a
INNER JOIN AllInRanges r
ON DATEADD(MICROSECOND,1,a.RangeEnd) = r.RangeStart
AND a.sDate = r.sDate
)
SELECT sDate, RangeStart, MAX(RangeEnd) AS RangeEnd,
RIGHT('0' + CAST(((DATEDIFF(SECOND,RangeStart, MAX(RangeEnd))+1)/3600) AS VARCHAR(2)),2)+':'+
RIGHT('0' + CAST(((DATEDIFF(SECOND,RangeStart, MAX(RangeEnd))+1)%3600)/60 AS VARCHAR(2)),2)+':'+
RIGHT('0' + CAST(((DATEDIFF(SECOND,RangeStart, MAX(RangeEnd))+1)%60) AS VARCHAR(2)),2) AS Duration,
WhosIn
FROM Periods
GROUP BY sDate, RangeStart, WhosIn
ORDER BY SDate, RangeStart
OPTION (MAXRECURSION 0)
There's a variable in there that let's us tweak the granularity. 599 is 10 minute windows. We're not going to bring back an overlaps smaller than that.
First we generate a complete set of ranges for 24 hours at the granularity we asked for MinutesRanges
.
Then we use that to find all the ranges where all the employees were clocked in at the same time, using the HAVING clause and comparing it to the CROSS APPLY where we fetch who all the employees are. There's a sneaky LAG windowed function in there too, that flags the start of a range for us AllInRanges
.
Once we know all the ranges we care about, we can use a rCTE
to group chain them together, using the first ranges RangeStart
Periods
.
Finally, we can select from Periods
and group by the sDate and RangeStart to give us the full ranges where all employees where in, and the duration of that.
sDate | RangeStart | RangeEnd | Duration | WhosIn |
---|---|---|---|---|
2023-02-23 | 08:30:00.0000000 | 08:59:59.9999990 | 00:30:00 | Anita, John |
2023-02-23 | 09:30:00.0000000 | 09:59:59.9999990 | 00:30:00 | Anita, John |
2023-02-23 | 10:30:00.0000000 | 11:59:59.9999990 | 01:30:00 | Anita, John |
2023-02-23 | 13:00:00.0000000 | 16:59:59.9999990 | 04:00:00 | Anita, John |
2023-02-24 | 09:30:00.0000000 | 09:59:59.9999990 | 00:30:00 | Anita, John |
2023-02-24 | 10:30:00.0000000 | 10:59:59.9999990 | 00:30:00 | Anita, John |
2023-02-24 | 11:30:00.0000000 | 11:59:59.9999990 | 00:30:00 | Anita, John |
2023-02-24 | 14:00:00.0000000 | 17:59:59.9999990 | 04:00:00 | Anita, John |
This might not be the most performant method, so YMMV.
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论