Counting Sick days over the weekend
Question
I'm trying to solve a problem in the following (simplified) dataset:
Name | Date | Workday | Calenderday | Leave |
---|---|---|---|---|
PersonA | 2023-01-01 | 0 | 1 | NULL |
PersonA | 2023-01-07 | 0 | 1 | NULL |
PersonA | 2023-01-08 | 0 | 1 | NULL |
PersonA | 2023-01-13 | 1 | 1 | Sick |
PersonA | 2023-01-14 | 0 | 1 | NULL |
PersonA | 2023-01-15 | 0 | 1 | NULL |
PersonA | 2023-01-16 | 1 | 1 | Sick |
PersonA | 2023-01-20 | 1 | 1 | Holiday |
PersonA | 2023-01-21 | 0 | 1 | NULL |
PersonA | 2023-01-22 | 0 | 1 | NULL |
PersonA | 2023-01-23 | 1 | 1 | Holiday |
PersonB | 2023-01-01 | 0 | 1 | NULL |
PersonB | 2023-01-02 | 1 | 1 | Sick |
PersonB | 2023-01-03 | 1 | 1 | Sick |
The rows with NULL in [Leave] are weekend days.
What I want is a result looking like this:
Name | Leave | PeriodStartDate | PeriodEndDate | Workdays | Weekdays |
---|---|---|---|---|---|
PersonA | Sick | 2023-01-13 | 2023-01-16 | 2 | 4 |
PersonA | Holiday | 2023-01-20 | 2023-01-23 | 2 | 4 |
PersonB | Sick | 2023-01-02 | 2023-01-03 | 2 | 2 |
The difference between [Workdays] and [Weekdays] is that [Weekdays] also counts the weekend days that fall inside the period.
What I have tried is to first add a row number (in two different ways):
ROW_NUMBER() OVER (PARTITION BY [Name] ORDER BY [Date]) AS RowNo1
,ROW_NUMBER() OVER (PARTITION BY [Name], [Leave] ORDER BY [Date]) AS RowNo2
then to derive a period base date from it:
DATEADD(DAY, 0 - [RowNo1], [Date]) AS PeriodBaseDate1
,DATEADD(DAY, 0 - [RowNo2], [Date]) AS PeriodBaseDate2
and finally to aggregate like this:
MIN([Date]) AS PeriodStartDate
,MAX([Date]) AS PeriodEndDate
,SUM([Calenderday]) AS Weekdays
,SUM([Workday]) AS Workdays
GROUP BY [PeriodBaseDate (1 or 2?)], [Leave], [Name]
But whatever I do, I can't seem to get it to count the weekends in the periods.
A solution doesn't have to build on my attempt with RowNo, PeriodBaseDate, etc.
Answer 1
Score: 1
I am not sure what you are trying to do. Based on what I understood, the script below gives the expected output.
SELECT Name,
       Leave,
       MIN(Date) AS PeriodStartDate,
       MAX(Date) AS PeriodEndDate,
       SUM(Workday) AS Workdays,
       DATEDIFF(DAY, MIN(Date), MAX(Date)) + 1 AS Weekdays
FROM YourTable
WHERE Leave IS NOT NULL
GROUP BY Name, Leave
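One caveat worth checking before relying on this query: because it groups only by Name and Leave, two separate periods of the same leave type for one person would collapse into a single row. The sketch below mimics that GROUP BY in plain Python. The February sick days are made-up rows added purely for illustration; they are not part of the question's dataset.

```python
from collections import defaultdict
from datetime import date

# (Name, Date, Workday, Leave); the February rows are hypothetical.
rows = [
    ("PersonA", date(2023, 1, 13), 1, "Sick"),
    ("PersonA", date(2023, 1, 16), 1, "Sick"),
    ("PersonA", date(2023, 2, 6), 1, "Sick"),
    ("PersonA", date(2023, 2, 7), 1, "Sick"),
]

# Equivalent of: WHERE Leave IS NOT NULL GROUP BY Name, Leave
groups = defaultdict(list)
for name, day, workday, leave in rows:
    if leave is not None:
        groups[(name, leave)].append((day, workday))

result = []
for (name, leave), members in groups.items():
    start = min(d for d, _ in members)
    end = max(d for d, _ in members)
    workdays = sum(w for _, w in members)
    weekdays = (end - start).days + 1  # DATEDIFF(DAY, start, end) + 1
    result.append((name, leave, start, end, workdays, weekdays))

for line in result:
    print(line)
```

Both sick periods come back as one row running from 2023-01-13 to 2023-02-07. If the real data can contain repeated periods, the grouping needs an extra key that identifies each period.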
Answer 2
Score: 1
As we don't have your actual full solution, I've provided a full working one. I first use LAST_VALUE to give every row a value for Leave (provided there was a non-NULL value previously).
Once that is done, you have a gaps and islands problem and can aggregate based on that.
I assume you are using SQL Server 2022, the latest version of SQL Server at the time of writing, since no version details are given, and thus have access to the IGNORE NULLS syntax.
SELECT *
INTO dbo.YourTable
FROM (VALUES('PersonA',CONVERT(date,'2023-01-01'),0,1,NULL),
            ('PersonA',CONVERT(date,'2023-01-07'),0,1,NULL),
            ('PersonA',CONVERT(date,'2023-01-08'),0,1,NULL),
            ('PersonA',CONVERT(date,'2023-01-13'),1,1,'Sick'),
            ('PersonA',CONVERT(date,'2023-01-14'),0,1,NULL),
            ('PersonA',CONVERT(date,'2023-01-15'),0,1,NULL),
            ('PersonA',CONVERT(date,'2023-01-16'),1,1,'Sick'),
            ('PersonA',CONVERT(date,'2023-01-20'),1,1,'Holiday'),
            ('PersonA',CONVERT(date,'2023-01-21'),0,1,NULL),
            ('PersonA',CONVERT(date,'2023-01-22'),0,1,NULL),
            ('PersonA',CONVERT(date,'2023-01-23'),1,1,'Holiday'),
            ('PersonB',CONVERT(date,'2023-01-01'),0,1,NULL),
            ('PersonB',CONVERT(date,'2023-01-02'),1,1,'Sick'),
            ('PersonB',CONVERT(date,'2023-01-03'),1,1,'Sick'))V(Name,Date,Workday,Calenderday,Leave);
GO
WITH Leaves AS(
    SELECT Name,
           [Date],
           Workday,
           Calenderday, --It's spelt Calendar, you should correct this typographical error as objects with typos lead to further problems.
           --Leave,
           --Carry the last non-NULL Leave forward so weekend rows inherit it
           LAST_VALUE(Leave) IGNORE NULLS OVER (PARTITION BY Name ORDER BY Date
                                                ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Leave
    FROM dbo.YourTable YT),
LeaveGroups AS(
    SELECT Name,
           [Date],
           Workday,
           Calenderday,
           Leave,
           --The difference between the two row numbers is constant within each island
           ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Date) -
           ROW_NUMBER() OVER (PARTITION BY Name, Leave ORDER BY Date) AS Grp
    FROM Leaves)
SELECT Name,
       Leave,
       MIN([Date]) AS PeriodStartDate,
       MAX([Date]) AS PeriodEndDate,
       SUM(Workday) AS Workdays, --Assumes Workday is not a bit, if it is, CAST or CONVERT it to an int
       DATEDIFF(DAY,MIN([Date]), MAX([Date]))+1 AS Weekdays
       --SUM(CASE WHEN (DATEPART(WEEKDAY,[Date]) + @@DATEFIRST + 5) % 7 BETWEEN 0 AND 4 THEN 1 END) AS Weekdays --This method is language agnostic
FROM LeaveGroups
WHERE Leave IS NOT NULL
GROUP BY Name,
         Leave,
         Grp
ORDER BY Name,
         PeriodStartDate;
GO
DROP TABLE dbo.YourTable;
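To see why subtracting the two ROW_NUMBER values isolates each period, the following plain-Python trace reproduces the three steps (fill forward, compute Grp, aggregate) on the question's sample data. It is only an illustration of the logic, not a replacement for the T-SQL; the helper name `leave_periods` is made up for this sketch.

```python
from collections import defaultdict
from datetime import date

# (Name, Date, Workday, Leave) from the question; None stands for NULL.
rows = [
    ("PersonA", date(2023, 1, 1), 0, None),
    ("PersonA", date(2023, 1, 7), 0, None),
    ("PersonA", date(2023, 1, 8), 0, None),
    ("PersonA", date(2023, 1, 13), 1, "Sick"),
    ("PersonA", date(2023, 1, 14), 0, None),
    ("PersonA", date(2023, 1, 15), 0, None),
    ("PersonA", date(2023, 1, 16), 1, "Sick"),
    ("PersonA", date(2023, 1, 20), 1, "Holiday"),
    ("PersonA", date(2023, 1, 21), 0, None),
    ("PersonA", date(2023, 1, 22), 0, None),
    ("PersonA", date(2023, 1, 23), 1, "Holiday"),
    ("PersonB", date(2023, 1, 1), 0, None),
    ("PersonB", date(2023, 1, 2), 1, "Sick"),
    ("PersonB", date(2023, 1, 3), 1, "Sick"),
]


def leave_periods(rows):
    """Fill Leave forward, tag islands with rn1 - rn2, then aggregate."""
    rows = sorted(rows, key=lambda r: (r[0], r[1]))
    last_leave = {}              # per Name: last non-NULL Leave seen so far
    rn_name = defaultdict(int)   # ROW_NUMBER() OVER (PARTITION BY Name ...)
    rn_leave = defaultdict(int)  # ROW_NUMBER() OVER (PARTITION BY Name, Leave ...)
    islands = defaultdict(list)
    for name, day, workday, leave in rows:
        # LAST_VALUE(Leave) IGNORE NULLS over all rows up to the current one
        if leave is not None:
            last_leave[name] = leave
        filled = last_leave.get(name)
        rn_name[name] += 1
        rn_leave[(name, filled)] += 1
        grp = rn_name[name] - rn_leave[(name, filled)]
        if filled is not None:  # WHERE Leave IS NOT NULL
            islands[(name, filled, grp)].append((day, workday))
    result = []
    for (name, leave, _grp), members in islands.items():
        start = min(d for d, _ in members)
        end = max(d for d, _ in members)
        result.append((name, leave, start.isoformat(), end.isoformat(),
                       sum(w for _, w in members), (end - start).days + 1))
    return sorted(result, key=lambda r: (r[0], r[2]))


periods = leave_periods(rows)
for p in periods:
    print(p)
```

On this data PersonA's rows get Grp 3 for the sick period and Grp 7 for the holiday, which is what keeps the two periods apart. One thing to verify against real data: the fill-forward step also labels any weekend rows that come after the final leave day, so a period ending on a Friday would be stretched over the following weekend if those rows exist in the table.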