July 28, 2023, 06:07:08

Query to find problems with missing row pairs, given a partition?

Question

I have a table where information is being stored like so:

Work Order	Employee	FunctionGroup	FunctionType	Timestamp
WO1	Emp1	Group1	Start	7/27/23 09:00
WO1	Emp1	Group1	Stop	7/27/23 10:00
WO1	Emp1	Group1	Start	7/27/23 11:00
WO1	Emp1	Group1	Stop	7/27/23 12:00

WO2	Emp2	Group2	Start	7/27/23 13:00
WO2	Emp2	Group2	Stop	7/27/23 14:00
WO2	Emp2	Group2	Start	7/27/23 15:00
WO2	Emp2	Group2	Stop	7/27/23 16:00

WO3	Emp3	Group3	Start	7/27/23 17:00 (problem here: since the next row is also a Start, then this row should be returned)
WO3	Emp3	Group3	Start	7/27/23 18:00
WO3	Emp3	Group3	Start	7/27/23 19:00
WO3	Emp3	Group3	Stop	7/27/23 20:00

WO4	Emp4	Group4	Stop	7/27/23 17:00 (problem here: since the dataset for this partition starts with a Stop instead of a Start, then this row should be returned)
WO4	Emp4	Group4	Start	7/27/23 18:00
WO4	Emp4	Group4	Start	7/27/23 19:00
WO4	Emp4	Group4	Stop	7/27/23 20:00

For each work order, employee, and function group, an employee can insert a Start and a Stop row (this is basically how the partition is defined). They must always be in the order of Start then Stop and cannot be backwards. This data collection will all start on a specific date/time, so there is a clean point where all the data must come in beginning with a Start. They can insert these pairs as many times as they need to. I need to write a query that checks whether there is a problem with the pairings and, if so, returns the first row where the problem appeared.

In the table above, the last 2 sections show a potential problem and the row that must be returned. The main challenge here is just figuring out how to determine which rows go together as a pair. One alternative way I am thinking of handling this is to simply insert a unique ID with every Start, then insert that same ID with every Stop. This might work, but for now I need a query that can show me problems for the test data I'm using.

Answer 1

Score: 3

Since the data is expected to arrive in Start/Stop pairs, you can assign a row number within each partition and compare the expected FunctionType with the actual FunctionType. Because you want the error reported on the row before, compare the expected value with the next row's FunctionType (obtained with LEAD) instead: every odd-numbered row should be followed by a Stop.

-- Test data from the question
declare @TestData table (WorkOrder varchar(3), Employee varchar(4), FunctionGroup varchar(6), FunctionType varchar(5), [Timestamp] datetime)
insert into @TestData (WorkOrder, Employee, FunctionGroup, FunctionType, [Timestamp])
values
('WO1','Emp1','Group1','Start','7/27/23 09:00'),
('WO1','Emp1','Group1','Stop','7/27/23 10:00'),
('WO1','Emp1','Group1','Start','7/27/23 11:00'),
('WO1','Emp1','Group1','Stop','7/27/23 12:00'),
('WO2','Emp2','Group2','Start','7/27/23 13:00'),
('WO2','Emp2','Group2','Stop','7/27/23 14:00'),
('WO2','Emp2','Group2','Start','7/27/23 15:00'),
('WO2','Emp2','Group2','Stop','7/27/23 16:00'),
('WO3','Emp3','Group3','Start','7/27/23 17:00'),-- (problem here: since the next row is also a Start, then this row should be returned)
('WO3','Emp3','Group3','Start','7/27/23 18:00'),
('WO3','Emp3','Group3','Start','7/27/23 19:00'),
('WO3','Emp3','Group3','Stop','7/27/23 20:00'),
('WO4','Emp4','Group4','Stop','7/27/23 17:00'),-- (problem here: since the dataset for this partition starts with a Stop instead of a Start, then this row should be returned)
('WO4','Emp4','Group4','Start','7/27/23 18:00'),
('WO4','Emp4','Group4','Start','7/27/23 19:00'),
('WO4','Emp4','Group4','Stop','7/27/23 20:00');
-- Number the rows within each partition and look ahead to the next row's FunctionType
with cte as (
  select *
    , row_number() over (partition by WorkOrder, Employee, FunctionGroup order by [Timestamp]) rn
    , lead(FunctionType) over (partition by WorkOrder, Employee, FunctionGroup order by [Timestamp]) FunctionTypeLead
  from @TestData
)
-- Odd-numbered rows open a pair, so the row after each of them should be a Stop
select WorkOrder, Employee, FunctionGroup, FunctionType, [Timestamp]
from cte
where rn%2 = 1 and FunctionTypeLead != 'Stop'
order by WorkOrder, Employee, FunctionGroup, [Timestamp];

Returns:

WorkOrder	Employee	FunctionGroup	FunctionType	Timestamp
WO3	Emp3	Group3	Start	2023-07-27 17:00:00.000
WO4	Emp4	Group4	Stop	2023-07-27 17:00:00.000

Note: Providing the DDL+DML (as shown here) makes it much easier to answer.
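
One case the parity check does not seem to cover is a partition that ends on a Start with no Stop after it: LEAD returns NULL there, and NULL != 'Stop' evaluates to unknown, so the row is filtered out. If that should count as a problem (an assumption, since the question does not say), a neighbour check with LAG and LEAD is one possible variation. The sketch below is not the query above; it runs the idea against SQLite from Python so it can be tried without SQL Server. The table name TestData, the column Ts, the ISO-formatted timestamps, and the WO5 partition are all made up for illustration, and WO2 is left out for brevity.

```python
import sqlite3

# In-memory SQLite database (window functions need SQLite 3.25 or later).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TestData (WorkOrder TEXT, Employee TEXT, FunctionGroup TEXT, "
    "FunctionType TEXT, Ts TEXT)"  # Ts is ISO-8601 text so it sorts chronologically
)
rows = [
    ("WO1", "Emp1", "Group1", "Start", "2023-07-27 09:00"),
    ("WO1", "Emp1", "Group1", "Stop", "2023-07-27 10:00"),
    ("WO1", "Emp1", "Group1", "Start", "2023-07-27 11:00"),
    ("WO1", "Emp1", "Group1", "Stop", "2023-07-27 12:00"),
    ("WO3", "Emp3", "Group3", "Start", "2023-07-27 17:00"),
    ("WO3", "Emp3", "Group3", "Start", "2023-07-27 18:00"),
    ("WO3", "Emp3", "Group3", "Start", "2023-07-27 19:00"),
    ("WO3", "Emp3", "Group3", "Stop", "2023-07-27 20:00"),
    ("WO4", "Emp4", "Group4", "Stop", "2023-07-27 17:00"),
    ("WO4", "Emp4", "Group4", "Start", "2023-07-27 18:00"),
    ("WO4", "Emp4", "Group4", "Start", "2023-07-27 19:00"),
    ("WO4", "Emp4", "Group4", "Stop", "2023-07-27 20:00"),
    # Hypothetical extra partition: ends on a Start that is never stopped.
    ("WO5", "Emp5", "Group5", "Start", "2023-07-27 09:00"),
    ("WO5", "Emp5", "Group5", "Stop", "2023-07-27 10:00"),
    ("WO5", "Emp5", "Group5", "Start", "2023-07-27 11:00"),
]
conn.executemany("INSERT INTO TestData VALUES (?, ?, ?, ?, ?)", rows)

QUERY = """
WITH neighbours AS (
    -- Attach the previous and next FunctionType within each partition.
    SELECT *,
           LAG(FunctionType)  OVER (PARTITION BY WorkOrder, Employee, FunctionGroup ORDER BY Ts) AS PrevType,
           LEAD(FunctionType) OVER (PARTITION BY WorkOrder, Employee, FunctionGroup ORDER BY Ts) AS NextType
    FROM TestData
),
flagged AS (
    -- A Start must be followed by a Stop; a Stop must be preceded by a Start.
    -- COALESCE turns a missing neighbour (NULL) into '' so it also counts as a mismatch.
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY WorkOrder, Employee, FunctionGroup ORDER BY Ts) AS ProblemNo
    FROM neighbours
    WHERE (FunctionType = 'Start' AND COALESCE(NextType, '') <> 'Stop')
       OR (FunctionType = 'Stop'  AND COALESCE(PrevType, '') <> 'Start')
)
-- Keep only the earliest flagged row of each partition.
SELECT WorkOrder, Employee, FunctionGroup, FunctionType, Ts
FROM flagged
WHERE ProblemNo = 1
ORDER BY WorkOrder, Employee, FunctionGroup
"""

problems = conn.execute(QUERY).fetchall()
for problem in problems:
    print(problem)
```

For this data it reports the same WO3 and WO4 rows as the query above, plus the unmatched 11:00 Start in the hypothetical WO5. The LAG, LEAD, ROW_NUMBER and COALESCE calls should carry over to T-SQL, but I have only sketched it against SQLite.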
