Returning rows that match a list of inputs
Question
I have a SQL Server database that stores recipes, and in this database there is a view that returns ingredient_name, ingredient_plural_name, recipe_id, recipe_name, recipe_description, tag_name (among other information not pertinent to this question).
The way the database is set up, this view will return a new row for each combination of ingredient and tag in the recipe (my test entry has 7 ingredients and 2 tags, so it returns 14 rows for this recipe) like this:
ingredient_name | ingredient_plural_name | tag_name | etc. |
---|---|---|---|
all purpose flour | NULL | dessert | The rest of the data is the same in all records |
all purpose flour | NULL | simple | |
baking powder | NULL | dessert | |
baking powder | NULL | simple | |
egg | eggs | dessert | |
egg | eggs | simple | |
milk | NULL | dessert | |
milk | NULL | simple | |
unsalted butter | NULL | dessert | |
unsalted butter | NULL | simple | |
vanilla extract | NULL | dessert | |
vanilla extract | NULL | simple | |
white sugar | NULL | dessert | |
white sugar | NULL | simple | |
I want to write a stored procedure that returns the recipe_name, recipe_description, and recipe_id only if the tags and/or name/plural name contain all values provided as a parameter input.
The recipe can have more tags and ingredients than the ones specified, but it must contain all of the ones specified. For example, it should return this sample recipe if I do
EXEC search_include @includeTags= 'milk, eggs'
but NOT if I do
EXEC search_include @includeTags= 'milk, eggs, cheese'
Am I missing something, or is this going to be a bit convoluted?
I currently have this code:
CREATE PROCEDURE [dbo].[search_include]
    (@includeTags varchar(max))
AS
BEGIN
    SELECT DISTINCT
        recipe_name, recipe_description, recipe_id
    FROM
        recipe_view
    WHERE
        tag_name IN (SELECT VALUE FROM dbo.split(@includeTags))
        OR ingredient_name IN (SELECT VALUE FROM dbo.split(@includeTags))
        OR ingredient_plural_name IN (SELECT VALUE FROM dbo.split(@includeTags))
END
This returns every recipe that matches at least one value in the list, but I need it to return only recipes that match ALL of the inputs.
I have also tried this approach:
BEGIN
    CREATE TABLE #tags (tag varchar(100))

    INSERT INTO #tags
    SELECT VALUE FROM dbo.split(@includeTags)

    SELECT recipe_id
    FROM recipe_view
    WHERE tag_name IN (SELECT tag FROM #tags)
        OR ingredient_name IN (SELECT tag FROM #tags)
        OR ingredient_plural_name IN (SELECT tag FROM #tags)
    GROUP BY recipe_id
    HAVING COUNT(*) = (SELECT COUNT(*) FROM #tags)

    DROP TABLE #tags
END
I adapted this from some similar (but not quite the same) questions on this site. It gives me no results whatsoever, no matter what I pass as the parameter.
The dbo.split function just takes a comma-separated list as a varchar string and returns a pseudo-table with each item in a new row of a "VALUE" column.
I'm not great with SQL by any means, and I'm having some difficulty wrapping my head around getting this right.
Answer 1
Score: 0
I think the issue is that the view returns multiple rows for the same ingredient. For example, if you searched for the two values 'milk' and 'eggs' and ran just the SELECT from your second attempt, without the GROUP BY and HAVING:
SELECT *
FROM #recipe_view
WHERE tag_name IN (SELECT tag FROM #tags)
    OR ingredient_name IN (SELECT tag FROM #tags)
    OR ingredient_plural_name IN (SELECT tag FROM #tags);
It will show 4 rows:
ingredient_name | ingredient_plural_name | tag_name | recipe_id |
---|---|---|---|
egg | eggs | dessert | cake |
egg | eggs | simple | cake |
milk | NULL | dessert | cake |
milk | NULL | simple | cake |
Therefore COUNT(*) will return 4 and will not match the HAVING clause (which expects a count of 2).
I suggest:
- Instead of having 'recipe_view' look like that, combine all relevant ingredients, plural names, tags, etc. into a single column.
- Do a match similar to your attempt above against that single column.
Here is a db<>fiddle that starts with the recipe_view, converts it to a single column to check (recipe_tag), then does the check.
I expect you have a better way of doing this (given that the view is already selecting data from elsewhere), but the 'view' I use (actually a temporary table called recipe_tags) has the following data/structure, with all the tags etc. relevant to the recipe.
recipe_id | recipe_tag |
---|---|
cake | all purpose flour |
cake | baking powder |
cake | dessert |
cake | egg |
cake | eggs |
cake | milk |
cake | simple |
cake | unsalted butter |
cake | vanilla extract |
cake | white sugar |
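One possible way to build such a single-column table directly from the view is sketched below. It assumes the view is named recipe_view as in the question and reuses the temporary table name #recipe_tags from this answer; it is not necessarily how the fiddle does it. UNION (rather than UNION ALL) removes duplicate rows, so each recipe/value pair appears only once, which the COUNT(*) comparison depends on.

```sql
-- Sketch: flatten the three searchable columns of recipe_view into one column.
-- UNION (not UNION ALL) de-duplicates, so each (recipe_id, recipe_tag)
-- pair is stored exactly once.
SELECT recipe_id, ingredient_name AS recipe_tag
INTO #recipe_tags
FROM recipe_view
WHERE ingredient_name IS NOT NULL
UNION
SELECT recipe_id, ingredient_plural_name
FROM recipe_view
WHERE ingredient_plural_name IS NOT NULL   -- most ingredients have no plural
UNION
SELECT recipe_id, tag_name
FROM recipe_view
WHERE tag_name IS NOT NULL;
```

For the sample recipe this should yield the ten rows listed above (7 ingredient names, 1 plural name, 2 tags).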
Then you can take an approach similar to the one you tried above against this table:
SELECT recipe_id
FROM #recipe_tags
INNER JOIN #tags ON #recipe_tags.recipe_tag = #tags.tag
GROUP BY recipe_id
HAVING COUNT(*) = (SELECT COUNT(*) FROM #tags);
This produces the expected output: the recipe_id (in this case 'cake').
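The question also asks for recipe_name and recipe_description. A minimal sketch of one way to get them is to use the matching query as a filter on the view. It assumes #recipe_tags and #tags exist as described in this answer; counting distinct values on both sides is an added precaution (not part of the original answer) so that a value typed twice in the search list does not break the comparison.

```sql
-- Sketch: return the recipe details for every recipe that matches
-- all of the search values held in #tags.
SELECT DISTINCT rv.recipe_id, rv.recipe_name, rv.recipe_description
FROM recipe_view AS rv
WHERE rv.recipe_id IN (
    SELECT rt.recipe_id
    FROM #recipe_tags AS rt
    INNER JOIN #tags AS t ON rt.recipe_tag = t.tag
    GROUP BY rt.recipe_id
    -- every distinct search value must have been matched
    HAVING COUNT(DISTINCT t.tag) = (SELECT COUNT(DISTINCT tag) FROM #tags)
);
```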
Answer 2
Score: 0
What you want is a query that, given a distinct list of recipes, selects each recipe where all of the search values are present among that recipe's tags and ingredients.
"Where all values are present" is logically equivalent to "where none of the values are missing". This seemingly double negative can be expressed with nested WHERE NOT EXISTS() conditions.
Something like:
SELECT R.*
FROM (
    SELECT DISTINCT recipe_name, recipe_description, recipe_id
    FROM recipe_view
) R
WHERE NOT EXISTS(
    SELECT *
    FROM #tags T
    WHERE NOT EXISTS(
        SELECT *
        FROM recipe_view I
        WHERE I.recipe_id = R.recipe_id
        AND (
            T.tag = I.tag_name
            OR T.tag = I.ingredient_name
            OR T.tag = I.ingredient_plural_name
        )
    )
)
See this db<>fiddle.
Using the view repeatedly in this query can be inefficient. If you have access to the underlying tables, it may be wise to rewrite the above to access those tables directly.
Answer 3
Score: 0
One option is to UNION ALL the three columns that need to be matched into a single column, then use a distinct count to check whether that column contains all of the search values.
CREATE PROCEDURE search_include
    @includeTags varchar(128)
AS
BEGIN
    CREATE TABLE #tags (tag varchar(100));

    INSERT INTO #tags
    SELECT VALUE FROM STRING_SPLIT(REPLACE(@includeTags, ', ', ','), ',');

    WITH rv AS
    (
        SELECT recipe_id, ingredient_name AS attribute_value
        FROM recipe_view WHERE ingredient_name IN (SELECT tag FROM #tags)
        UNION ALL
        SELECT recipe_id, ingredient_plural_name
        FROM recipe_view WHERE ingredient_plural_name IN (SELECT tag FROM #tags)
        UNION ALL
        SELECT recipe_id, tag_name
        FROM recipe_view WHERE tag_name IN (SELECT tag FROM #tags)
    )
    SELECT recipe_id
    FROM rv
    GROUP BY recipe_id
    HAVING COUNT(DISTINCT attribute_value) = (SELECT COUNT(*) FROM #tags);

    DROP TABLE #tags;
END
The REPLACE(@includeTags, ', ', ',') call inside the split removes the space that may follow each comma in the parameter value, e.g. 'milk, eggs' -> 'milk,eggs'.
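REPLACE only handles a single space directly after a comma. As an alternative sketch (not part of the procedure above), each split value can be trimmed individually, which also copes with extra or trailing spaces. The sample parameter value is made up for illustration; STRING_SPLIT requires SQL Server 2016 or later with database compatibility level 130 or higher.

```sql
-- Sketch: fill #tags with trimmed, de-duplicated search values.
DECLARE @includeTags varchar(128) = 'milk,  eggs , dessert';  -- illustrative input

CREATE TABLE #tags (tag varchar(100));

INSERT INTO #tags (tag)
SELECT DISTINCT LTRIM(RTRIM(value))       -- strip spaces on both sides
FROM STRING_SPLIT(@includeTags, ',')
WHERE LTRIM(RTRIM(value)) <> '';          -- ignore empty items such as 'a,,b'

SELECT tag FROM #tags;

DROP TABLE #tags;
```

The DISTINCT matters here because the procedure compares a distinct count of matched values against COUNT(*) of #tags, so a value supplied twice would otherwise prevent any match.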