Count distinct based on order of another column

Question


I have a table containing measurements. It looks like this:

SELECT [Plant] --plant name
       ,[Machine] --machine name
       ,[Material] --material name
       ,[Batch] --batch name
       ,[DMC] --data matrix code, each part has one
       ,[ProductionTime] --time stamp UTC
       ,[Result] --0/1 as NOK/OK
FROM [dbo].[Results]

Each row is one measurement with one result. Each tested part has a DMC. Since a part can be tested several times, the same DMC can appear in several rows (with different results).

Example data:

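| Plant | Machine  | Material  | Batch | DMC | ProductionTime      | Result |
|-------|----------|-----------|-------|-----|---------------------|--------|
| A     | MachineA | MaterialA | X     | ABC | 2023-02-16 16:21:52 | 1      |
| A     | MachineA | MaterialA | X     | DEF | 2023-02-16 16:21:30 | 1      |
| A     | MachineA | MaterialA | X     | DEF | 2023-02-16 16:21:09 | 0      |
| A     | MachineA | MaterialA | Y     | GHI | 2023-02-16 16:20:47 | 1      |
| A     | MachineB | MaterialA | X     | JKL | 2023-02-16 16:20:24 | 0      |
| A     | MachineB | MaterialB | Y     | MNO | 2023-02-16 16:20:03 | 1      |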

To calculate the scrap rate, I need to count the number of OK and NOK parts:
- #OKParts: for each DMC, take only the row with its latest production time, then count the rows with Result = 1
- #NOKParts: the same, but with Result = 0

Additionally, for calculating the FTT yield ("first time through"):
- #PartsFirstTestOK: for each DMC, take only the row with its oldest production time, then count the rows with Result = 1
- #PartsFirstTestNOK: the same, but with Result = 0
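As a rough illustration of these definitions, the sketch below reads each DMC's first and last result from the example rows. It assumes SQL Server syntax. The table variable @Results and the aliases FirstResult and LastResult are made up for this illustration and are not part of the original table.

```sql
-- Hypothetical table variable holding the example rows from the question.
DECLARE @Results TABLE (
    Plant          varchar(10),
    Machine        varchar(20),
    Material       varchar(20),
    Batch          varchar(10),
    DMC            varchar(20),
    ProductionTime datetime2(0),
    Result         bit
);

INSERT INTO @Results VALUES
    ('A', 'MachineA', 'MaterialA', 'X', 'ABC', '2023-02-16 16:21:52', 1),
    ('A', 'MachineA', 'MaterialA', 'X', 'DEF', '2023-02-16 16:21:30', 1),
    ('A', 'MachineA', 'MaterialA', 'X', 'DEF', '2023-02-16 16:21:09', 0),
    ('A', 'MachineA', 'MaterialA', 'Y', 'GHI', '2023-02-16 16:20:47', 1),
    ('A', 'MachineB', 'MaterialA', 'X', 'JKL', '2023-02-16 16:20:24', 0),
    ('A', 'MachineB', 'MaterialB', 'Y', 'MNO', '2023-02-16 16:20:03', 1);

-- One row per DMC: the result of its oldest test and of its latest test.
-- FIRST_VALUE over a descending order is used for the latest test, which
-- avoids the default-frame pitfall of LAST_VALUE.
SELECT DISTINCT
       DMC
      ,FIRST_VALUE(Result) OVER (PARTITION BY DMC ORDER BY ProductionTime ASC)  AS FirstResult
      ,FIRST_VALUE(Result) OVER (PARTITION BY DMC ORDER BY ProductionTime DESC) AS LastResult
FROM @Results
ORDER BY DMC;
```

For DEF this yields FirstResult = 0 and LastResult = 1, so that part would count toward #PartsFirstTestNOK and toward #OKParts.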

Since this table is very big, I need to reduce the amount of data by filtering the table (see the WHERE clause below) and to group the results.

Right now, my query looks like this:

SELECT [Plant]
      ,[Machine]
      ,[Material]
      ,[Batch]
      ,Count([Result]) as '#Tests' --all rows
      ,Count(Distinct [DMC]) as '#Parts' --all different DMCs
      ,COUNT(CASE when [Result] = 1 THEN 1 END) as '#OKTests'
      ,COUNT(CASE when [Result] = 0 THEN 1 END) as '#NOKTests'
  FROM [dbo].[Results]
  where [Plant] = 'A' and [ProductionTime] > DATEADD(DAY, -365,GETDATE())
  group by [Plant],[Material],[Batch],[Machine]

(Sum(Cast([Result] as INT)) and Count([Result])-Sum(Cast([Result] as INT)) instead of the CASE functions would also work here)
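A small sketch of that equivalence, assuming SQL Server and a bit-typed Result column. The table variable @Results and the output aliases are hypothetical. The CAST is needed because SUM does not accept bit operands.

```sql
-- Hypothetical three-row sample: two OK tests and one NOK test.
DECLARE @Results TABLE (DMC varchar(20), Result bit);

INSERT INTO @Results VALUES
    ('ABC', 1),
    ('DEF', 1),
    ('DEF', 0);

SELECT COUNT(CASE WHEN Result = 1 THEN 1 END)     AS OKTestsCase   -- CASE form
      ,SUM(CAST(Result AS int))                   AS OKTestsSum    -- SUM form
      ,COUNT(CASE WHEN Result = 0 THEN 1 END)     AS NOKTestsCase  -- CASE form
      ,COUNT(Result) - SUM(CAST(Result AS int))   AS NOKTestsSum   -- SUM form
FROM @Results;
```

Both forms agree on any non-empty group. One difference to keep in mind: over an empty input without GROUP BY, SUM returns NULL while COUNT returns 0.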

This currently gives the following result:

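| Plant | Machine  | Material  | Batch | #Tests | #Parts | #OKTests | #NOKTests |
|-------|----------|-----------|-------|--------|--------|----------|-----------|
| A     | MachineA | MaterialA | X     | 3      | 3      | 3        | 0         |
| A     | MachineA | MaterialA | Y     | 124    | 96     | 93       | 31        |
| A     | MachineB | MaterialA | X     | 11     | 9      | 9        | 2         |
| A     | MachineB | MaterialB | Y     | 21     | 13     | 11       | 10        |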

I want to extend this result with the columns #OKParts, #NOKParts, #PartsFirstTestOK and #PartsFirstTestNOK.

I have tried a subquery and the FIRST_VALUE and OVER functions, but none of them worked for me. I am a beginner in SQL.

Does anybody have an idea how to solve this?
Thanks a lot in advance!

Answer 1

Score: 1


The following uses windowed ROW_NUMBER() functions to assign both ascending and descending sequence numbers to the series of tests for a given part. Checking whether one or the other number equals 1 limits a count to just the first or just the last test.

SELECT Plant
      ,Machine
      ,Material
      ,Batch
      ,Count(Result) as [#Tests] --all rows
      ,Count(Distinct DMC) as [#Parts] --all different DMCs
      ,COUNT(CASE WHEN Result = 1 THEN 1 END) as [#OKTests]
      ,COUNT(CASE WHEN Result = 0 THEN 1 END) as [#NOKTests]
      ,COUNT(CASE when RowNumAsc = 1 AND Result = 1 THEN 1 END) as [#PartsFirstTestOK]
      ,COUNT(CASE when RowNumAsc = 1 AND Result = 0 THEN 1 END) as [#PartsFirstTestNOK]
      ,COUNT(CASE when RowNumDesc = 1 AND Result = 1 THEN 1 END) as [#OKParts]
      ,COUNT(CASE when RowNumDesc = 1 AND Result = 0 THEN 1 END) as [#NOKParts]
FROM (
    SELECT *,
        ROW_NUMBER() OVER(
            PARTITION BY Plant, Material, Batch, Machine, DMC
            ORDER BY ProductionTime
            ) AS RowNumAsc,
        ROW_NUMBER() OVER(
            PARTITION BY Plant, Material, Batch, Machine, DMC
            ORDER BY ProductionTime DESC
            ) AS RowNumDesc
    FROM table1
) T
where Plant = 'A' and ProductionTime > DATEADD(DAY, -365,GETDATE())
group by Plant, Material, Batch, Machine;

Sample results:

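| Plant | Machine  | Material  | Batch | #Tests | #Parts | #OKTests | #NOKTests | #PartsFirstTestOK | #PartsFirstTestNOK | #OKParts | #NOKParts |
|-------|----------|-----------|-------|--------|--------|----------|-----------|-------------------|--------------------|----------|-----------|
| A     | MachineA | MaterialA | X     | 3      | 2      | 2        | 1         | 1                 | 1                  | 2        | 0         |
| A     | MachineB | MaterialA | X     | 1      | 1      | 0        | 1         | 0                 | 1                  | 0        | 1         |
| A     | MachineA | MaterialA | Y     | 1      | 1      | 1        | 0         | 1                 | 0                  | 1        | 0         |
| A     | MachineB | MaterialB | Y     | 1      | 1      | 1        | 0         | 1                 | 0                  | 1        | 0         |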
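To see where these counts come from, it can help to look at the row numbers before they are aggregated. The sketch below runs only the inner part of the query against the question's example rows. The table variable @Results is a hypothetical stand-in for the real table.

```sql
-- Hypothetical table variable holding the example rows from the question.
DECLARE @Results TABLE (
    Plant          varchar(10),
    Machine        varchar(20),
    Material       varchar(20),
    Batch          varchar(10),
    DMC            varchar(20),
    ProductionTime datetime2(0),
    Result         bit
);

INSERT INTO @Results VALUES
    ('A', 'MachineA', 'MaterialA', 'X', 'ABC', '2023-02-16 16:21:52', 1),
    ('A', 'MachineA', 'MaterialA', 'X', 'DEF', '2023-02-16 16:21:30', 1),
    ('A', 'MachineA', 'MaterialA', 'X', 'DEF', '2023-02-16 16:21:09', 0),
    ('A', 'MachineA', 'MaterialA', 'Y', 'GHI', '2023-02-16 16:20:47', 1),
    ('A', 'MachineB', 'MaterialA', 'X', 'JKL', '2023-02-16 16:20:24', 0),
    ('A', 'MachineB', 'MaterialB', 'Y', 'MNO', '2023-02-16 16:20:03', 1);

-- Number each part's tests from oldest to newest and from newest to oldest.
SELECT Machine
      ,Batch
      ,DMC
      ,ProductionTime
      ,Result
      ,ROW_NUMBER() OVER (
           PARTITION BY Plant, Material, Batch, Machine, DMC
           ORDER BY ProductionTime
       ) AS RowNumAsc   -- 1 marks the first test of the part
      ,ROW_NUMBER() OVER (
           PARTITION BY Plant, Material, Batch, Machine, DMC
           ORDER BY ProductionTime DESC
       ) AS RowNumDesc  -- 1 marks the latest test of the part
FROM @Results
ORDER BY Machine, Batch, DMC, ProductionTime;
```

DEF is the only part with two rows: its 16:21:09 test gets RowNumAsc = 1 and RowNumDesc = 2, and its 16:21:30 test gets RowNumAsc = 2 and RowNumDesc = 1. Every other part has a single row where both numbers are 1. If two tests of the same part ever share a ProductionTime, ROW_NUMBER() orders them arbitrarily, so adding a tie-breaker column to the ORDER BY would make the choice deterministic.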

See this db<>fiddle for a working demo.
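One design choice worth checking is where the filter sits. In the query above, the WHERE clause is applied after the row numbers are assigned, so "first" and "last" refer to the part's whole history, and a part whose first test falls outside the date range contributes to neither first-test counter. The sketch below is a possible variant, not the answer's query, that filters before numbering. This changes the meaning to "first and last test within the filtered period" and may reduce the number of rows that have to be numbered. The table variable @Results and the fixed cutoff @Since are assumptions made for this illustration.

```sql
-- Hypothetical table variable holding the example rows from the question.
DECLARE @Results TABLE (
    Plant          varchar(10),
    Machine        varchar(20),
    Material       varchar(20),
    Batch          varchar(10),
    DMC            varchar(20),
    ProductionTime datetime2(0),
    Result         bit
);

INSERT INTO @Results VALUES
    ('A', 'MachineA', 'MaterialA', 'X', 'ABC', '2023-02-16 16:21:52', 1),
    ('A', 'MachineA', 'MaterialA', 'X', 'DEF', '2023-02-16 16:21:30', 1),
    ('A', 'MachineA', 'MaterialA', 'X', 'DEF', '2023-02-16 16:21:09', 0),
    ('A', 'MachineA', 'MaterialA', 'Y', 'GHI', '2023-02-16 16:20:47', 1),
    ('A', 'MachineB', 'MaterialA', 'X', 'JKL', '2023-02-16 16:20:24', 0),
    ('A', 'MachineB', 'MaterialB', 'Y', 'MNO', '2023-02-16 16:20:03', 1);

-- Fixed cutoff chosen for illustration; the original query uses
-- DATEADD(DAY, -365, GETDATE()) instead.
DECLARE @Since datetime2(0) = '2023-02-16 16:21:20';

SELECT Plant
      ,Machine
      ,Material
      ,Batch
      ,COUNT(Result) AS [#Tests]
      ,COUNT(CASE WHEN RowNumAsc  = 1 AND Result = 1 THEN 1 END) AS [#PartsFirstTestOK]
      ,COUNT(CASE WHEN RowNumAsc  = 1 AND Result = 0 THEN 1 END) AS [#PartsFirstTestNOK]
      ,COUNT(CASE WHEN RowNumDesc = 1 AND Result = 1 THEN 1 END) AS [#OKParts]
      ,COUNT(CASE WHEN RowNumDesc = 1 AND Result = 0 THEN 1 END) AS [#NOKParts]
FROM (
    SELECT Plant, Machine, Material, Batch, DMC, Result,
        ROW_NUMBER() OVER (
            PARTITION BY Plant, Material, Batch, Machine, DMC
            ORDER BY ProductionTime
        ) AS RowNumAsc,
        ROW_NUMBER() OVER (
            PARTITION BY Plant, Material, Batch, Machine, DMC
            ORDER BY ProductionTime DESC
        ) AS RowNumDesc
    FROM @Results
    WHERE Plant = 'A'
      AND ProductionTime > @Since  -- filter is applied before the rows are numbered
) T
GROUP BY Plant, Material, Batch, Machine;
```

With this cutoff only ABC's test and DEF's retest remain, so DEF's retest is numbered as its first test and #PartsFirstTestOK is 2. With the filter left outside the derived table, DEF's remaining row would keep RowNumAsc = 2 and #PartsFirstTestOK would be 1. Which behaviour is wanted depends on how FTT yield should treat parts first tested before the reporting period.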
