英文:
Updating EndDate in DimEmployee table
问题
以下是您提供的内容的中文翻译:
我一直在编写一个脚本来创建一个Type2 DimEmployee表。我希望每当员工的详细信息发生更改时,它都会创建一个新的记录。
我相信我已经正确地完成了这部分工作。我卡住的地方是如何正确更新已更改记录的EndDate。
对于没有更改的记录,它的EndDate是NULL。
对于一个月是这样,下个月是那样的记录,它也能插入正确的EndDate值。
然而,当Staging_Employees中有多个相同的记录,然后发生更改时,它就不使用正确的EndDate值。
附注:分期表中的日期字段充当快照日期
有谁可以帮助我吗?
谢谢
例如:
如您所见,DimEmployee表中第一条记录的EndDate现在是2023年2月28日,这是StaffNo 4078之前出现不同记录的最后相同记录的日期。对于第二条记录,EndDate保持为NULL,因为它之后没有StaffNo 4078的记录。
使用的代码:
(此处省略了代码部分,如果需要继续翻译,请提供特定的代码段或问题。)
英文:
I have been working on a script to create a Type2 DimEmployee Table. I want it to create a new record every time there is a change to the employees details.
I believe I have this part working correctly. What I am stuck on is correctly updating the EndDate for records that have changed.
It works were records have not changed the EndDate is NULL
it also works for records were one month it was this and then the next month it was that (it inserts the correct EndDate value).
However where it is not working is when there has been multiple identical records in the Staging_Employees and then there has been a change. It does not use the correct EndDate value.
P.S the Date feild in the staging table acts as a snapshot date
can anyone help me with this?
Thanks
For example:
As you can see, the EndDate for the first record in the DimEmployee table is now 28/02/2023, the date of the last identical record before a different record appears for StaffNo 4078. For the second record, EndDate remains NULL as there are no more records for StaffNo 4078 after it.
Code Used:
BEGIN TRY
TRUNCATE TABLE DimEmployee;
END TRY
BEGIN CATCH
CREATE TABLE DimEmployee (
DimEmployeeID INT IDENTITY(1,1) PRIMARY KEY,
StructureID INT,
StaffNo INT NOT NULL,
EmployeeID INT,
Position varchar(20),
JobTitle VARCHAR(100),
ContractType VARCHAR(50),
AverageHoursPerWeek DECIMAL(5,2),
WeeksPeryr DECIMAL(5,2),
HoursPerWeek DECIMAL(5,2),
PublicHolidayZone VARCHAR(50),
FTE DECIMAL(5,2),
AnalysisGroup VARCHAR(50),
EffectiveDate DATE,
EndDate DATE
);
END CATCH;
-- Insert new records with changes in the specified fields
WITH ChangedRecords AS (
SELECT
s.StaffNo,
RANK() OVER (ORDER BY s.StaffNo) AS EmployeeID,
s.Position,
s.JobTitle,
s.ContractType,
s.AverageHoursPerWeek,
s.WeeksPeryr,
s.HoursPerWeek,
s.PublicHolidayZone,
s.FTE,
s.AnalysisGroup,
d.StructureID,
s.Date,
LAG(d.StructureID) OVER (PARTITION BY s.StaffNo ORDER BY s.Date) AS PreviousStructureID,
LAG(s.Position) OVER (PARTITION BY s.StaffNo ORDER BY s.Date) AS PreviousPosition,
LAG(s.JobTitle) OVER (PARTITION BY s.StaffNo ORDER BY s.Date) AS PreviousJobTitle,
LAG(s.ContractType) OVER (PARTITION BY s.StaffNo ORDER BY s.Date) AS PreviousContractType,
LAG(s.AverageHoursPerWeek) OVER (PARTITION BY s.StaffNo ORDER BY s.Date) AS PreviousAverageHoursPerWeek,
LAG(s.WeeksPeryr) OVER (PARTITION BY s.StaffNo ORDER BY s.Date) AS PreviousWeeksPeryr,
LAG(s.HoursPerWeek) OVER (PARTITION BY s.StaffNo ORDER BY s.Date) AS PreviousHoursPerWeek,
LAG(s.PublicHolidayZone) OVER (PARTITION BY s.StaffNo ORDER BY s.Date) AS PreviousPublicHolidayZone,
LAG(s.FTE) OVER (PARTITION BY s.StaffNo ORDER BY s.Date) AS PreviousFTE,
LAG(s.AnalysisGroup) OVER (PARTITION BY s.StaffNo ORDER BY s.Date) AS PreviousAnalysisGroup,
ROW_NUMBER() OVER (PARTITION BY s.StaffNo, YEAR(s.Date), MONTH(s.Date), DAY(s.Date) ORDER BY s.Date) AS RowNum
FROM Staging_Employees AS s
JOIN DimStructure AS d ON
s.Directorate = d.Directorate AND
s.Service = d.Service AND
s.Section = d.Section AND
s.Team = d.Team
)
INSERT INTO DimEmployee (StructureID, StaffNo, EmployeeID, Position, JobTitle, ContractType, AverageHoursPerWeek, WeeksPeryr, HoursPerWeek, PublicHolidayZone, FTE, AnalysisGroup, EffectiveDate, EndDate)
SELECT StructureID, StaffNo, EmployeeID, Position, JobTitle, ContractType, AverageHoursPerWeek, WeeksPeryr, HoursPerWeek, PublicHolidayZone, FTE, AnalysisGroup, Date, NULL
FROM ChangedRecords
WHERE (PreviousStructureID IS NULL OR PreviousStructureID <> StructureID
OR PreviousPosition <> Position
OR PreviousJobTitle <> JobTitle
OR PreviousContractType <> ContractType
OR PreviousAverageHoursPerWeek <> AverageHoursPerWeek
OR PreviousWeeksPeryr <> WeeksPeryr
OR PreviousHoursPerWeek <> HoursPerWeek
OR PreviousPublicHolidayZone <> PublicHolidayZone
OR PreviousFTE <> FTE
OR PreviousAnalysisGroup <> AnalysisGroup);
-- Update EndDate for old records with changes in the specified fields
WITH ChangedRecords AS (
SELECT
s.StaffNo,
RANK() OVER (ORDER BY s.StaffNo) AS EmployeeID,
s.Position,
s.JobTitle,
s.ContractType,
s.AverageHoursPerWeek,
s.WeeksPeryr,
s.HoursPerWeek,
s.PublicHolidayZone,
s.FTE,
s.AnalysisGroup,
d.StructureID,
s.Date,
LAG(d.StructureID) OVER (PARTITION BY s.StaffNo ORDER BY s.Date) AS PreviousStructureID,
LAG(s.Position) OVER (PARTITION BY s.StaffNo ORDER BY s.Date) AS PreviousPosition,
LAG(s.JobTitle) OVER (PARTITION BY s.StaffNo ORDER BY s.Date) AS PreviousJobTitle,
LAG(s.ContractType) OVER (PARTITION BY s.StaffNo ORDER BY s.Date) AS PreviousContractType,
LAG(s.AverageHoursPerWeek) OVER (PARTITION BY s.StaffNo ORDER BY s.Date) AS PreviousAverageHoursPerWeek,
LAG(s.WeeksPeryr) OVER (PARTITION BY s.StaffNo ORDER BY s.Date) AS PreviousWeeksPeryr,
LAG(s.HoursPerWeek) OVER (PARTITION BY s.StaffNo ORDER BY s.Date) AS PreviousHoursPerWeek,
LAG(s.PublicHolidayZone) OVER (PARTITION BY s.StaffNo ORDER BY s.Date) AS PreviousPublicHolidayZone,
LAG(s.FTE) OVER (PARTITION BY s.StaffNo ORDER BY s.Date) AS PreviousFTE,
LAG(s.AnalysisGroup) OVER (PARTITION BY s.StaffNo ORDER BY s.Date) AS PreviousAnalysisGroup
FROM Staging_Employees AS s
JOIN DimStructure AS d ON
s.Directorate = d.Directorate AND
s.Service = d.Service AND
s.Section = d.Section AND
s.Team = d.Team
)
UPDATE de
SET EndDate = (
SELECT MAX(se.Date)
FROM Staging_Employees se
WHERE de.StaffNo = se.StaffNo
AND se.Date <= de.EffectiveDate
)
FROM DimEmployee de
WHERE de.EndDate IS NULL
AND EXISTS (
SELECT 1
FROM ChangedRecords cr
WHERE cr.StaffNo = de.StaffNo
AND cr.EmployeeID = de.EmployeeID
AND cr.Date > de.EffectiveDate
AND (
cr.PreviousStructureID <> cr.StructureID
OR cr.PreviousPosition <> cr.Position
OR cr.PreviousJobTitle <> cr.JobTitle
OR cr.PreviousContractType <> cr.ContractType
OR cr.PreviousAverageHoursPerWeek <> cr.AverageHoursPerWeek
OR cr.PreviousWeeksPeryr <> cr.WeeksPeryr
OR cr.PreviousHoursPerWeek <> cr.HoursPerWeek
OR cr.PreviousPublicHolidayZone <> cr.PublicHolidayZone
OR cr.PreviousFTE <> cr.FTE
OR cr.PreviousAnalysisGroup <> cr.AnalysisGroup
)
);
答案1
得分: 0
根据您提供的内容,涉及更新多次更改的记录的EndDate的问题可能与您用于获取新EndDate的子查询有关,因为它似乎只考虑了小于或等于记录被更新的EffectiveDate的Staging_Employees
表中的最大日期。
这个子查询可能没有考虑到在这个范围内可能发生的多次更改。
要解决此问题,您需要修改子查询,以考虑小于或等于EffectiveDate并且大于先前EndDate的记录的最大日期。这将确保您选择了具有多次更改的记录的正确EndDate。
-- 更新具有指定字段更改的旧记录的EndDate
WITH ChangedRecords AS (
-- 与之前相同
)
UPDATE de
SET EndDate = (
SELECT MAX(se.Date)
FROM Staging_Employees se
WHERE de.StaffNo = se.StaffNo
AND se.Date <= de.EffectiveDate
AND se.Date > COALESCE(de.EndDate, '1900-01-01') -- 考虑在先前EndDate之后的日期
)
FROM DimEmployee de
WHERE de.EndDate IS NULL
AND EXISTS (
SELECT 1
FROM ChangedRecords cr
WHERE cr.StaffNo = de.StaffNo
AND cr.EmployeeID = de.EmployeeID
AND cr.Date > de.EffectiveDate
AND (
cr.PreviousStructureID <> cr.StructureID
OR cr.PreviousPosition <> cr.Position
OR cr.PreviousJobTitle <> cr.JobTitle
OR cr.PreviousContractType <> cr.ContractType
OR cr.PreviousAverageHoursPerWeek <> cr.AverageHoursPerWeek
OR cr.PreviousWeeksPeryr <> cr.WeeksPeryr
OR cr.PreviousHoursPerWeek <> cr.HoursPerWeek
OR cr.PreviousPublicHolidayZone <> cr.PublicHolidayZone
OR cr.PreviousFTE <> cr.FTE
OR cr.PreviousAnalysisGroup <> cr.AnalysisGroup
)
);
希望这可以帮助您解决问题。
英文:
Based on what you provided, the issue with updating the EndDate for records that have changed multiple times may be related to the subquery you are using to get the new EndDate
since it seems that it only considers the maximum date from the Staging_Employees
table that is less than or equal to the EffectiveDate
of the record being updated.
This subquery may not take into account the multiple changes that might have occurred within that range.
To resolve this issue, you need to modify the subquery to consider the maximum date that is both less than or equal to the EffectiveDate
and also greater than the previous EndDate
of the record being updated. This will ensure that you're selecting the correct EndDate
for records that have multiple changes.
-- Update EndDate for old records with changes in the specified fields
WITH ChangedRecords AS (
-- Same as before
)
UPDATE de
SET EndDate = (
SELECT MAX(se.Date)
FROM Staging_Employees se
WHERE de.StaffNo = se.StaffNo
AND se.Date <= de.EffectiveDate
AND se.Date > COALESCE(de.EndDate, '1900-01-01') -- Consider dates after the previous EndDate
)
FROM DimEmployee de
WHERE de.EndDate IS NULL
AND EXISTS (
SELECT 1
FROM ChangedRecords cr
WHERE cr.StaffNo = de.StaffNo
AND cr.EmployeeID = de.EmployeeID
AND cr.Date > de.EffectiveDate
AND (
cr.PreviousStructureID <> cr.StructureID
OR cr.PreviousPosition <> cr.Position
OR cr.PreviousJobTitle <> cr.JobTitle
OR cr.PreviousContractType <> cr.ContractType
OR cr.PreviousAverageHoursPerWeek <> cr.AverageHoursPerWeek
OR cr.PreviousWeeksPeryr <> cr.WeeksPeryr
OR cr.PreviousHoursPerWeek <> cr.HoursPerWeek
OR cr.PreviousPublicHolidayZone <> cr.PublicHolidayZone
OR cr.PreviousFTE <> cr.FTE
OR cr.PreviousAnalysisGroup <> cr.AnalysisGroup
)
);
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论