SQL COUNT DISTINCT with condition based on another column

Question

I'm stuck on this problem and can't find a solution.

I have a table and I want to count the number of parts for each station and error type, but if a part has multiple errors at the same station, only the lexicographically highest error should be counted.

The data looks like this:

| station     |   error   |   uniquepart_id   |
| ----------- | --------- | ----------------- |
| A           | ERR_01    | 0001              |
| A           | ERR_01    | 0001              |
| A           | ERR_02    | 0002              |
| A           | ERR_02    | 0002              |
| A           | ERR_03    | 0001              |
| A           | ERR_03    | 0002              |
| A           | ERR_03    | 0003              |
| A           | ERR_03    | 0004              |
| B           | ERR_01    | 0005              |
| B           | ERR_01    | 0006              |
| B           | ERR_02    | 0007              |
| B           | ERR_02    | 0008              |
| B           | ERR_03    | 0009              |
| B           | ERR_03    | 0010              |
| B           | ERR_03    | 0011              |
| B           | ERR_03    | 0012              |

I wrote the following query:

SELECT station, error, COUNT(DISTINCT uniquepart_id) AS num_parts
FROM Tablename
WHERE process_date = 'xx-xx-xxxx'
GROUP BY station, error

I'm getting this result:

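| station     |   error   |   num_parts   |
| ----------- | --------- | ------------- |
| A           | ERR_01    | 1             |
| A           | ERR_02    | 1             |
| A           | ERR_03    | 4             |
| B           | ERR_01    | 2             |
| B           | ERR_02    | 2             |
| B           | ERR_03    | 4             |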

and I'm looking for this:

| station     |   error   |   num_parts   |
| ----------- | --------- | ------------- |
| A           | ERR_03    | 4             |
| B           | ERR_01    | 2             |
| B           | ERR_02    | 2             |
| B           | ERR_03    | 4             |

I tried to use MAX and HAVING to filter the rows within each group, but I'm getting syntax errors. I think it could be solved with an inner query.
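
To make the problem reproducible, here is a minimal sketch that loads the sample rows into an in-memory SQLite database through Python's built-in `sqlite3` module and runs the query above. The choice of SQLite is an assumption (the question does not name a database engine), and the `process_date` filter is left out because the sample data has no such column. The second query is an extra diagnostic, not part of the original attempt: it lists the parts that appear under more than one error code at the same station, which are the ones being counted more than once.

```python
import sqlite3

# Sample rows copied from the table in the question:
# (station, error, uniquepart_id).
ROWS = [
    ("A", "ERR_01", "0001"), ("A", "ERR_01", "0001"),
    ("A", "ERR_02", "0002"), ("A", "ERR_02", "0002"),
    ("A", "ERR_03", "0001"), ("A", "ERR_03", "0002"),
    ("A", "ERR_03", "0003"), ("A", "ERR_03", "0004"),
    ("B", "ERR_01", "0005"), ("B", "ERR_01", "0006"),
    ("B", "ERR_02", "0007"), ("B", "ERR_02", "0008"),
    ("B", "ERR_03", "0009"), ("B", "ERR_03", "0010"),
    ("B", "ERR_03", "0011"), ("B", "ERR_03", "0012"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Tablename (station TEXT, error TEXT, uniquepart_id TEXT)"
)
conn.executemany("INSERT INTO Tablename VALUES (?, ?, ?)", ROWS)

# The query from the question, without the process_date filter.
per_error = conn.execute("""
    SELECT station, error, COUNT(DISTINCT uniquepart_id) AS num_parts
    FROM Tablename
    GROUP BY station, error
    ORDER BY station, error
""").fetchall()

# Diagnostic: parts reported under several error codes at one station.
repeated = conn.execute("""
    SELECT station, uniquepart_id, COUNT(DISTINCT error) AS num_errors
    FROM Tablename
    GROUP BY station, uniquepart_id
    HAVING COUNT(DISTINCT error) > 1
    ORDER BY station, uniquepart_id
""").fetchall()

for row in per_error:
    print(row)
print("parts with several errors:", repeated)
```

On the sample data, parts 0001 and 0002 at station A each carry two different error codes, so they are counted once under ERR_01 or ERR_02 and again under ERR_03.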

Answer 1

Score: 2

Process your data in two stages.

First (inner query) collapse your data to one error code per part per station.

Then process as you have been.

SELECT
  station, error, COUNT(*) AS num_parts
FROM
(
  SELECT station, MAX(error) AS error, uniquepart_id
  FROM Tablename
  WHERE process_date = 'xx-xx-xxxx'
  GROUP BY station, uniquepart_id
)
  AS station_error_per_part
GROUP BY
  station, error
ORDER BY
  station, error
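
The two stages can be checked against the sample data from the question. This is a sketch under assumptions: it uses an in-memory SQLite database via Python's `sqlite3` module (the question does not say which engine is in use), and it drops the `process_date` filter because the sample rows have no such column. The `stage1` and `result` variable names are only illustrative.

```python
import sqlite3

# Sample rows from the question: (station, error, uniquepart_id).
ROWS = [
    ("A", "ERR_01", "0001"), ("A", "ERR_01", "0001"),
    ("A", "ERR_02", "0002"), ("A", "ERR_02", "0002"),
    ("A", "ERR_03", "0001"), ("A", "ERR_03", "0002"),
    ("A", "ERR_03", "0003"), ("A", "ERR_03", "0004"),
    ("B", "ERR_01", "0005"), ("B", "ERR_01", "0006"),
    ("B", "ERR_02", "0007"), ("B", "ERR_02", "0008"),
    ("B", "ERR_03", "0009"), ("B", "ERR_03", "0010"),
    ("B", "ERR_03", "0011"), ("B", "ERR_03", "0012"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Tablename (station TEXT, error TEXT, uniquepart_id TEXT)"
)
conn.executemany("INSERT INTO Tablename VALUES (?, ?, ?)", ROWS)

# Stage 1 on its own: one row per (station, part), keeping only the
# lexicographically highest error code for that part.
stage1 = conn.execute("""
    SELECT station, MAX(error) AS error, uniquepart_id
    FROM Tablename
    GROUP BY station, uniquepart_id
    ORDER BY station, uniquepart_id
""").fetchall()

# Stage 2: count the collapsed rows per (station, error).
result = conn.execute("""
    SELECT station, error, COUNT(*) AS num_parts
    FROM (
        SELECT station, MAX(error) AS error, uniquepart_id
        FROM Tablename
        GROUP BY station, uniquepart_id
    ) AS station_error_per_part
    GROUP BY station, error
    ORDER BY station, error
""").fetchall()

print(len(stage1), "rows after stage 1")
for row in result:
    print(row)
```

Because stage 1 leaves exactly one row per part per station, a plain `COUNT(*)` in stage 2 is enough and `COUNT(DISTINCT ...)` is no longer needed.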

Answer 2

Score: -1

SELECT *
FROM
(
  SELECT station, error, COUNT(DISTINCT uniquepart_id) AS num_parts
       , DENSE_RANK() OVER (PARTITION BY error ORDER BY COUNT(DISTINCT uniquepart_id) DESC) AS rn
  FROM test
  -- WHERE (process_date= 'xx-xx-xxxx')
  GROUP BY station, error
) t
WHERE rn = 1
ORDER BY station, error

Update 1: @MatBailie, thank you for pointing out the error.

SELECT *
FROM
(
  SELECT station, error, COUNT(*) AS num_parts
       , DENSE_RANK() OVER (PARTITION BY station, error ORDER BY COUNT(*) DESC) AS rn
  FROM (SELECT station, uniquepart_id, MAX(error) AS error
        FROM test
        -- WHERE (process_date= 'xx-xx-xxxx')
        GROUP BY station, uniquepart_id
       ) p
  GROUP BY station, error
) t
WHERE rn = 1
ORDER BY station, error

The rn value is, of course, superfluous: after the GROUP BY, each (station, error) partition holds exactly one row, so rn is always 1.
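
A quick way to confirm that `rn` never filters anything is to run the inner part of the updated query on the sample data and inspect the rank column. The sketch below assumes SQLite 3.25 or newer (needed for window functions) accessed through Python's `sqlite3` module. The engine is an assumption, since the answer does not name one. The `process_date` filter stays commented out, as in the answer.

```python
import sqlite3

# Sample rows from the question: (station, error, uniquepart_id).
ROWS = [
    ("A", "ERR_01", "0001"), ("A", "ERR_01", "0001"),
    ("A", "ERR_02", "0002"), ("A", "ERR_02", "0002"),
    ("A", "ERR_03", "0001"), ("A", "ERR_03", "0002"),
    ("A", "ERR_03", "0003"), ("A", "ERR_03", "0004"),
    ("B", "ERR_01", "0005"), ("B", "ERR_01", "0006"),
    ("B", "ERR_02", "0007"), ("B", "ERR_02", "0008"),
    ("B", "ERR_03", "0009"), ("B", "ERR_03", "0010"),
    ("B", "ERR_03", "0011"), ("B", "ERR_03", "0012"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (station TEXT, error TEXT, uniquepart_id TEXT)")
conn.executemany("INSERT INTO test VALUES (?, ?, ?)", ROWS)

# The updated query without the outer "WHERE rn = 1" filter, so that
# every rank value stays visible.
ranked = conn.execute("""
    SELECT station, error, COUNT(*) AS num_parts,
           DENSE_RANK() OVER (
               PARTITION BY station, error ORDER BY COUNT(*) DESC
           ) AS rn
    FROM (SELECT station, uniquepart_id, MAX(error) AS error
          FROM test
          GROUP BY station, uniquepart_id) p
    GROUP BY station, error
    ORDER BY station, error
""").fetchall()

for station, error, num_parts, rn in ranked:
    print(station, error, num_parts, rn)
```

Every row comes back with `rn` equal to 1, so the outer filter and the window function can be dropped without changing the result.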

huangapple
  • Posted on May 7, 2023 at 05:30:12
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76191239.html