英文:
Query to get ids of entries which appear in arbitrary amount different lists
问题
| 产品编号 |
|--------|
| 2137 |
英文:
id | category_id | product_id | status |
---|---|---|---|
13 | 93 | 2137 | 1 |
14 | 94 | 2137 | 1 |
15 | 93 | 2138 | 2 |
16 | 94 | 2138 | 2 |
17 | 87 | 2128 | 1 |
18 | 94 | 2128 | 1 |
19 | 87 | 2139 | 2 |
20 | 94 | 2139 | 2 |
21 | 88 | 2132 | 1 |
22 | 93 | 2132 | 1 |
23 | 88 | 2140 | 2 |
24 | 93 | 2140 | 2 |
25 | 87 | 2137 | 1 |
26 | 87 | 2141 | 2 |
27 | 93 | 2136 | 1 |
28 | 93 | 2137 | 1 |
29 | 88 | 2134 | 1 |
30 | 88 | 2143 | 2 |
I have this kind of data presented to me. For my query I'm given a list of category ids.
Let's say I'm given three lists with
1. {93, 94}
2. {88, 87, 86}
3. {93}
Now I would need a query, which would give me product ids, which appear at least once in ALL of those lists and for which the status is 1. So for the example query the result should be:
product_id |
---|
2137 |
答案1
得分: 1
以下是翻译好的代码部分:
这应该适用于SQL Server 2016及更高版本。
CREATE TABLE table1
(
id INT,
category_id INT,
product_id INT,
status INT
)
INSERT INTO table1
(id, category_id, product_id, status)
VALUES
(13, 93, 2137, 1)
,(14, 94, 2137, 1)
,(15, 93, 2138, 2)
,(16, 94, 2138, 2)
,(17, 87, 2128, 1)
,(18, 94, 2128, 1)
,(19, 87, 2139, 2)
,(20, 94, 2139, 2)
,(21, 88, 2132, 1)
,(22, 93, 2132, 1)
,(23, 88, 2140, 2)
,(24, 93, 2140, 2)
,(25, 87, 2137, 1)
,(26, 87, 2141, 2)
,(27, 93, 2136, 1)
,(28, 93, 2137, 1)
,(29, 88, 2134, 1)
,(30, 88, 2143, 2)
CREATE TABLE Input
(IdLst varchar(100))
INSERT INTO Input (IdLst)
VALUES
('{93, 94}')
,('{88, 87, 86}')
,('{93}')
;WITH Categories AS (
SELECT CONVERT(INT, Value ) category_id
FROM Input
CROSS APPLY STRING_SPLIT(REPLACE(REPLACE( IdLst, '{', ''), '}', ''), ',')
)
SELECT product_id
FROM Categories
INNER JOIN table1 ON table1.category_id = Categories.category_id
GROUP BY product_id
HAVING COUNT(1) = (SELECT COUNT(1) cntCategories FROM Categories )
希望对你有所帮助。如果有其他问题,请随时提问。
英文:
This should work for SQL Server 2016 and above.
CREATE TABLE table1
(
id INT,
category_id INT,
product_id INT,
status INT
)
INSERT INTO table1
(id, category_id, product_id, status)
VALUES
( 13, 93, 2137, 1)
,( 14, 94, 2137, 1)
,( 15, 93, 2138, 2)
,( 16, 94, 2138, 2)
,( 17, 87, 2128, 1)
,( 18, 94, 2128, 1)
,( 19, 87, 2139, 2)
,( 20, 94, 2139, 2)
,( 21, 88, 2132, 1)
,( 22, 93, 2132, 1)
,( 23, 88, 2140, 2)
,( 24, 93, 2140, 2)
,( 25, 87, 2137, 1)
,( 26, 87, 2141, 2)
,( 27, 93, 2136, 1)
,( 28, 93, 2137, 1)
,( 29, 88, 2134, 1)
,( 30, 88, 2143, 2)
CREATE TABLE Input
(IdLst varchar(100))
INSERT INTO Input (IdLst)
VALUES
('{93, 94}')
,('{88, 87, 86}')
,('{93}')
;WITH Categories AS (
SELECT CONVERT(INT, Value ) category_id
FROM Input
CROSS APPLY STRING_SPLIT(REPLACE(REPLACE( IdLst, '{', ''), '}', ''), ',')
)
SELECT product_id
FROM Categories
INNER JOIN table1 ON table1.category_id = Categories.category_id
GROUP BY product_id
HAVING COUNT(1) = (SELECT COUNT(1) cntCategories FROM Categories )
答案2
得分: 1
任何解决方案的第一步都是将选择标准数据规范化为一个表格,格式为 { category_group_id, category_id },每行只有一个 category_id。有几种方法可以做到这一点,但在这里我使用了相对较新的 STRING_SPLIT 函数(与 Luis LL 相同)。这个规范化的标准可以加载到一个临时表中,也可以作为公共表达式(CTE)包含在下面的查询中。
一旦标准化了标准,就可以通过以下步骤解决实际问题:(1)按状态过滤输入数据,(2)与上述标准化的选择标准进行连接,(3)按产品 ID 进行分组,然后(4)计算匹配的不同 category group IDs 的数量。如果该数量与 category group IDs 的总数匹配(对于样本数据是三个),则表示匹配。
;WITH NormalizedCategoryIds AS (
SELECT C.category_group_id, CAST(S.Value AS INT) AS category_id
FROM CategoryIds C
CROSS APPLY STRING_SPLIT(
REPLACE(REPLACE(category_id_list, '{', ''), '}', ''),
',') S
)
SELECT D.product_id
FROM SampleData D
JOIN NormalizedCategoryIds C on C.category_id = D.category_id
WHERE D.status = 1
GROUP BY D.product_id
HAVING COUNT(DISTINCT C.category_group_id) = (SELECT COUNT(*) FROM CategoryIds)
如果我们从已经规范化的标准开始,HAVING
子句可以更改为:
HAVING COUNT(DISTINCT C.category_group_id)
= (SELECT COUNT(DISTINCT C2.category_group_id) FROM NormalizedCategoryIds C2)
该值也可以在查询之前计算。
样本结果:
product_id |
---|
2132 |
2137 |
尽管在最初发布的结果中没有,但在这里也包括了 2132,因为它与所有三个类别组匹配。第 93 行与类别组 1 和 3 匹配,第 88 条记录与类别组 2 匹配。
在这个 db<>fiddle 中有一个包含一些额外测试数据的工作演示。
英文:
The first step in any solution is to normalize the selection criteria data into a table of the form { category_group_id, category_id } with only one category_id for row. There are several ways to do this but I've used the relatively new STRING_SPLIT function here (same as Luis LL). This normalized criteria may be loaded into a temp table or included as a Common Table Expression (CTE) as is done below.
Once the criteria is normalized, the real problem can be solved by (1) filtering the input data by status, (2) joining it with the normalized selection criteria from above, (3) grouping by product ID, and then (4) counting the number of distinct category group IDs matched. If that count matches the total number of category group IDs (three for the sample data), we have a match.
;WITH NormalizedCategoryIds AS (
SELECT C.category_group_id, CAST(S.Value AS INT) AS category_id
FROM CategoryIds C
CROSS APPLY STRING_SPLIT(
REPLACE(REPLACE(category_id_list, '{', ''), '}', ''),
',') S
)
SELECT D.product_id
FROM SampleData D
JOIN NormalizedCategoryIds C on C.category_id = D.category_id
WHERE D.status = 1
GROUP BY D.product_id
HAVING COUNT(DISTINCT C.category_group_id) = (SELECT COUNT(*) FROM CategoryIds)
If we started with criteria that was already normalized, the HAVING
clause could be changed to:
HAVING COUNT(DISTINCT C.category_group_id)
= (SELECT COUNT(DISTINCT C2.category_group_id) FROM NormalizedCategoryIds C2)
That value could also be calculated ahead of the query.
Sample results:
product_id |
---|
2132 |
2137 |
Even though not in the original posted results, 2132 is also included here, because it matches all three category groups. The 93 row matches category groups 1 and 3 and the 88 record matches category group 2.
See this db<>fiddle for a working demo including some extra test data.
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论