2023-03-09 23:55:54

Count distinct based on order of another column

Question

I have a table containing measurements. It looks like this:

SELECT [Plant] --plant name
       ,[Machine] --machine name
       ,[Material] --material name
       ,[Batch] --batch name
       ,[DMC] --data matrix code, each part has one
       ,[ProductionTime] --time stamp UTC
       ,[Result] --0/1 as NOK/OK
FROM [dbo].[Results]

Each row is one measurement, having a result. Each tested part has a DMC. Since parts can be tested several times, we have duplicated DMCs (with different results) in this table.

Example data:

Plant	Machine	Material	Batch	DMC	ProductionTime	Result
A	MachineA	MaterialA	X	ABC	2023-02-16 16:21:52	1
A	MachineA	MaterialA	X	DEF	2023-02-16 16:21:30	1
A	MachineA	MaterialA	X	DEF	2023-02-16 16:21:09	0
A	MachineA	MaterialA	Y	GHI	2023-02-16 16:20:47	1
A	MachineB	MaterialA	X	JKL	2023-02-16 16:20:24	0
A	MachineB	MaterialB	Y	MNO	2023-02-16 16:20:03	1
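
For anyone who wants to reproduce this, the sample rows above can be loaded with a script along these lines. It is only a sketch: the temp table name #Results and all column types are assumptions (Result is guessed to be a bit column, ProductionTime a datetime2), not the real table definition.

```sql
-- Hypothetical repro script: table name and column types are assumptions.
CREATE TABLE #Results (
    Plant          varchar(10)  NOT NULL,
    Machine        varchar(50)  NOT NULL,
    Material       varchar(50)  NOT NULL,
    Batch          varchar(50)  NOT NULL,
    DMC            varchar(50)  NOT NULL,  -- data matrix code, one per part
    ProductionTime datetime2(0) NOT NULL,  -- time stamp (UTC)
    Result         bit          NOT NULL   -- 0 = NOK, 1 = OK
);

-- The six sample measurements; DEF was tested twice.
INSERT INTO #Results (Plant, Machine, Material, Batch, DMC, ProductionTime, Result)
VALUES
    ('A', 'MachineA', 'MaterialA', 'X', 'ABC', '2023-02-16T16:21:52', 1),
    ('A', 'MachineA', 'MaterialA', 'X', 'DEF', '2023-02-16T16:21:30', 1),
    ('A', 'MachineA', 'MaterialA', 'X', 'DEF', '2023-02-16T16:21:09', 0),
    ('A', 'MachineA', 'MaterialA', 'Y', 'GHI', '2023-02-16T16:20:47', 1),
    ('A', 'MachineB', 'MaterialA', 'X', 'JKL', '2023-02-16T16:20:24', 0),
    ('A', 'MachineB', 'MaterialB', 'Y', 'MNO', '2023-02-16T16:20:03', 1);

-- Quick check that the rows were loaded.
SELECT * FROM #Results ORDER BY ProductionTime DESC;
```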

To calculate the scrap rate, I need to count the number of OK and NOK parts. The conditions are:
- #OKParts: for each DMC, take only the row with its latest production time, then count how many of these rows have Result = 1
- #NOKParts: the same, but with Result = 0

Additionally, for calculating the FTT yield ("first time through"):
- #PartsFirstTestOK: for each DMC, take only the row with its oldest production time, then count how many of these rows have Result = 1
- #PartsFirstTestNOK: the same, but with Result = 0

Since this table is very big, I need to reduce the amount of data by filtering the table (see the WHERE clause below) and grouping the results.

Right now, my query looks like this:

SELECT [Plant]
      ,[Machine]
      ,[Material]
      ,[Batch]
      ,Count([Result]) as '#Tests' --all rows
      ,Count(Distinct [DMC]) as '#Parts' --all different DMCs
      ,COUNT(CASE when [Result] = 1 THEN 1 END) as '#OKTests'
      ,COUNT(CASE when [Result] = 0 THEN 1 END) as '#NOKTests'
  FROM [dbo].[Results]
  where [Plant] = 'A' and [ProductionTime] > DATEADD(DAY, -365,GETDATE())
  group by [Plant],[Material],[Batch],[Machine]

(Using Sum(Cast([Result] as INT)) and Count([Result])-Sum(Cast([Result] as INT)) instead of the CASE expressions would also work here.)

This currently gives the following result:

Plant	Machine	Material	Batch	#Tests	#Parts	#OKTests	#NOKTests
A	MachineA	MaterialA	X	3	3	3	0
A	MachineA	MaterialA	Y	124	96	93	31
A	MachineB	MaterialA	X	11	9	9	2
A	MachineB	MaterialB	Y	21	13	11	10

I want to extend this result with the columns #OKParts, #NOKParts, #PartsFirstTestOK, and #PartsFirstTestNOK.

I have tried a subquery, and FIRST_VALUE with an OVER clause, but neither worked for me. I am fairly new to SQL.

Does anybody have an idea how to solve this?
Thanks a lot in advance!

Answer 1

Score: 1

The following uses windowed ROW_NUMBER() functions to assign both ascending and descending sequence numbers to the series of tests for a given part. Checking whether one or the other equals 1 limits a count to just the first or the last test.

SELECT Plant
      ,Machine
      ,Material
      ,Batch
      ,Count(Result) as [#Tests] --all rows
      ,Count(Distinct DMC) as [#Parts] --all different DMCs
      ,COUNT(CASE WHEN Result = 1 THEN 1 END) as [#OKTests]
      ,COUNT(CASE WHEN Result = 0 THEN 1 END) as [#NOKTests]
      ,COUNT(CASE when RowNumAsc = 1 AND Result = 1 THEN 1 END) as [#PartsFirstTestOK]
      ,COUNT(CASE when RowNumAsc = 1 AND Result = 0 THEN 1 END) as [#PartsFirstTestNOK]
      ,COUNT(CASE when RowNumDesc = 1 AND Result = 1 THEN 1 END) as [#OKParts]
      ,COUNT(CASE when RowNumDesc = 1 AND Result = 0 THEN 1 END) as [#NOKParts]
FROM (
    SELECT *,
        ROW_NUMBER() OVER(
            PARTITION BY Plant, Material, Batch, Machine, DMC
            ORDER BY ProductionTime
            ) AS RowNumAsc,
        ROW_NUMBER() OVER(
            PARTITION BY Plant, Material, Batch, Machine, DMC
            ORDER BY ProductionTime DESC
            ) AS RowNumDesc
    FROM table1
) T
where Plant = 'A' and ProductionTime > DATEADD(DAY, -365,GETDATE())
group by Plant, Material, Batch, Machine;

Sample results:

Plant	Machine	Material	Batch	#Tests	#Parts	#OKTests	#NOKTests	#PartsFirstTestOK	#PartsFirstTestNOK	#OKParts	#NOKParts
A	MachineA	MaterialA	X	3	2	2	1	1	1	2	0
A	MachineB	MaterialA	X	1	1	0	1	0	1	0	1
A	MachineA	MaterialA	Y	1	1	1	0	1	0	1	0
A	MachineB	MaterialB	Y	1	1	1	0	1	0	1	0
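
To see what the two row numbers look like, the inner query can be run on its own against the sample rows from the question. This is only an illustrative sketch: the inline SampleData CTE is a stand-in for the real table and is not part of the query above.

```sql
-- SampleData is a hypothetical stand-in for the real table.
WITH SampleData AS (
    SELECT v.Plant, v.Machine, v.Material, v.Batch, v.DMC,
           CAST(v.ProductionTime AS datetime2(0)) AS ProductionTime,
           v.Result
    FROM (VALUES
        ('A', 'MachineA', 'MaterialA', 'X', 'ABC', '2023-02-16T16:21:52', 1),
        ('A', 'MachineA', 'MaterialA', 'X', 'DEF', '2023-02-16T16:21:30', 1),
        ('A', 'MachineA', 'MaterialA', 'X', 'DEF', '2023-02-16T16:21:09', 0),
        ('A', 'MachineA', 'MaterialA', 'Y', 'GHI', '2023-02-16T16:20:47', 1),
        ('A', 'MachineB', 'MaterialA', 'X', 'JKL', '2023-02-16T16:20:24', 0),
        ('A', 'MachineB', 'MaterialB', 'Y', 'MNO', '2023-02-16T16:20:03', 1)
    ) AS v (Plant, Machine, Material, Batch, DMC, ProductionTime, Result)
)
SELECT DMC, ProductionTime, Result,
       -- 1 = oldest test of this part
       ROW_NUMBER() OVER(
           PARTITION BY Plant, Material, Batch, Machine, DMC
           ORDER BY ProductionTime
           ) AS RowNumAsc,
       -- 1 = latest test of this part
       ROW_NUMBER() OVER(
           PARTITION BY Plant, Material, Batch, Machine, DMC
           ORDER BY ProductionTime DESC
           ) AS RowNumDesc
FROM SampleData
ORDER BY DMC, ProductionTime;

-- DMC  ProductionTime       Result  RowNumAsc  RowNumDesc
-- ABC  2023-02-16 16:21:52  1       1          1
-- DEF  2023-02-16 16:21:09  0       1          2
-- DEF  2023-02-16 16:21:30  1       2          1
-- GHI  2023-02-16 16:20:47  1       1          1
-- JKL  2023-02-16 16:20:24  0       1          1
-- MNO  2023-02-16 16:20:03  1       1          1
```

Only DEF was tested twice: its first test (RowNumAsc = 1) failed and its last test (RowNumDesc = 1) passed, which is why MachineA / MaterialA / X shows one #PartsFirstTestNOK but no #NOKParts.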

See this db<>fiddle for a working demo.
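
One thing to be aware of: in the query above the row numbers are assigned in the derived table, before the outer WHERE is applied, so "first" and "last" refer to a part's whole test history. A part first tested more than 365 days ago and retested recently would still count towards #Parts and #OKParts/#NOKParts, but towards neither first-test column. If the first and last test should instead be judged only within the filtered window, a possible variant is to move the filter into the derived table. This is a sketch under that assumption, reusing the table1 name from above, and not necessarily what you need:

```sql
-- Variant sketch: filter first, then number the remaining tests per part.
SELECT Plant
      ,Machine
      ,Material
      ,Batch
      ,COUNT(Result) AS [#Tests]
      ,COUNT(DISTINCT DMC) AS [#Parts]
      ,COUNT(CASE WHEN Result = 1 THEN 1 END) AS [#OKTests]
      ,COUNT(CASE WHEN Result = 0 THEN 1 END) AS [#NOKTests]
      ,COUNT(CASE WHEN RowNumAsc = 1 AND Result = 1 THEN 1 END) AS [#PartsFirstTestOK]
      ,COUNT(CASE WHEN RowNumAsc = 1 AND Result = 0 THEN 1 END) AS [#PartsFirstTestNOK]
      ,COUNT(CASE WHEN RowNumDesc = 1 AND Result = 1 THEN 1 END) AS [#OKParts]
      ,COUNT(CASE WHEN RowNumDesc = 1 AND Result = 0 THEN 1 END) AS [#NOKParts]
FROM (
    SELECT Plant, Machine, Material, Batch, DMC, Result,
        ROW_NUMBER() OVER(
            PARTITION BY Plant, Material, Batch, Machine, DMC
            ORDER BY ProductionTime
            ) AS RowNumAsc,
        ROW_NUMBER() OVER(
            PARTITION BY Plant, Material, Batch, Machine, DMC
            ORDER BY ProductionTime DESC
            ) AS RowNumDesc
    FROM table1
    -- Filter moved inside, so only rows in the window get numbered.
    WHERE Plant = 'A' AND ProductionTime > DATEADD(DAY, -365, GETDATE())
) T
GROUP BY Plant, Material, Batch, Machine;
```

With this variant every part in a group has exactly one row with RowNumAsc = 1 and one with RowNumDesc = 1, so #PartsFirstTestOK + #PartsFirstTestNOK and #OKParts + #NOKParts both add up to #Parts.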
