SQL query to compare two rows within the same table

Question

Here I have a pizza table:

pizza_id    toppings
1           1,2,3,4,5,6,8,10
2           4,6,7,9,11,12

I would like to find the toppings that the pizzas have in common, i.e. the toppings used by both pizza_ids. The expected result is shown in the table below:

pizza_id    toppings
1           4,6
2           4,6

I have tried using JOINs but couldn't satisfy the condition.
Could anyone please give me a hint?
Thank you.

Answer 1

Score: 1

Your database does not respect first normal form (1NF): each table cell must contain a single value. The best way to fix this is to have a pizza_table and a topping_table with an N-to-N relationship, so that a separate table links each pizza_id to every topping it has.

The pizza table looks like this:

pizza_id    pizza_name
1           Margherita
2           Capricciosa

The topping table looks like this:

topping_id    topping_name
1             Pomodoro
2             Mozzarella

And the N-to-N table will be:

pizza_id    topping_id
1           1
1           2
...         ...

With this table you can perform all the operations you need to get your data.
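
As a rough sketch, the two base tables could be declared like this in MySQL. The column types and the table names pizzas and toppings are assumptions (picked to line up with the names used in the other answer), not something specified above; the N-to-N table would then reference both primary keys.

```sql
-- Hypothetical DDL for the two base tables; names and types are assumptions.
CREATE TABLE pizzas (
    pizza_id   INT UNSIGNED NOT NULL PRIMARY KEY,
    pizza_name VARCHAR(100) NOT NULL
);

CREATE TABLE toppings (
    topping_id   INT UNSIGNED NOT NULL PRIMARY KEY,
    topping_name VARCHAR(100) NOT NULL
);

-- Sample rows taken from the tables shown above.
INSERT INTO pizzas (pizza_id, pizza_name)
VALUES (1, 'Margherita'), (2, 'Capricciosa');

INSERT INTO toppings (topping_id, topping_name)
VALUES (1, 'Pomodoro'), (2, 'Mozzarella');
```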

Answer 2

Score: 1

As already pointed out by everyone, you should not be storing comma-separated values in a single cell like that.

But, to answer your question, assuming you are looking for the intersection of the toppings CSV for every pair of pizzas, and you have a toppings table with (topping_id, name), you could do something like:

SELECT
    p1.pizza_id AS p1_id,
    p2.pizza_id AS p2_id,
    GROUP_CONCAT(t.topping_id) AS toppings,  -- toppings shared by the pair
    COUNT(*) AS num                          -- how many toppings they share
FROM pizzas p1
JOIN toppings t
    -- strip the space after each comma so FIND_IN_SET can match
    ON FIND_IN_SET(t.topping_id, REPLACE(p1.toppings, ', ', ','))
JOIN pizzas p2
    -- p1 < p2 lists each pair of pizzas only once
    ON p1.pizza_id < p2.pizza_id
    AND FIND_IN_SET(t.topping_id, REPLACE(p2.toppings, ', ', ','))
GROUP BY p1.pizza_id, p2.pizza_id
ORDER BY num DESC;

Given these pizzas:

pizza_id    toppings
1           1, 2, 3, 4, 5, 6, 8, 10
2           4, 6, 7, 9, 11, 12
3           1, 6

The above query will return:

p1_id    p2_id    toppings    num
1        2        4,6         2
1        3        1,6         2
2        3        6           1

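This is insanely inefficient, since FIND_IN_SET on a computed string cannot use an index, so every pizza row has to be checked against every topping row. It would be much better served by the junction table suggested by ElNicho.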

If you switch to using a junction (N-to-N) table like pizzas_toppings (pizza_id, topping_id), the query becomes:

SELECT
    p1.pizza_id AS p1_id,
    p2.pizza_id AS p2_id,
    GROUP_CONCAT(p1.topping_id) AS toppings,
    COUNT(*) AS num
FROM pizzas_toppings p1
JOIN pizzas_toppings p2
    ON p1.pizza_id < p2.pizza_id
    AND p1.topping_id = p2.topping_id
GROUP BY p1.pizza_id, p2.pizza_id
ORDER BY num DESC;
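
The query above lists shared toppings per pair of pizzas. If what you want is the output shape from the question (each pizza_id next to the toppings that every pizza has), one possible sketch is a relational-division style query. It assumes the same pizzas_toppings table and that each (pizza_id, topping_id) pair appears only once; the aliases pt and common are just illustrative names.

```sql
-- Toppings that appear on every pizza, repeated once per pizza_id.
-- Assumes (pizza_id, topping_id) is unique, so COUNT(*) per topping
-- equals the number of pizzas that use it.
SELECT
    pt.pizza_id,
    GROUP_CONCAT(pt.topping_id ORDER BY pt.topping_id) AS toppings
FROM pizzas_toppings pt
JOIN (
    SELECT topping_id
    FROM pizzas_toppings
    GROUP BY topping_id
    -- keep a topping only if it is used by as many pizzas as exist in the table
    HAVING COUNT(*) = (SELECT COUNT(DISTINCT pizza_id) FROM pizzas_toppings)
) common
    ON common.topping_id = pt.topping_id
GROUP BY pt.pizza_id
ORDER BY pt.pizza_id;
```

With only the two pizzas from the question loaded, this should give toppings 4,6 for both pizza_id 1 and pizza_id 2. Note that a pizza with no rows in pizzas_toppings is not counted at all.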

Make sure your junction table is indexed in both directions:

CREATE TABLE `pizzas_toppings` (
    pizza_id INT UNSIGNED NOT NULL,
    topping_id INT UNSIGNED NOT NULL,
    PRIMARY KEY (pizza_id, topping_id),
    INDEX (topping_id, pizza_id),
    FOREIGN KEY (pizza_id) REFERENCES pizzas (pizza_id),
    FOREIGN KEY (topping_id) REFERENCES toppings (topping_id)
);
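
If you already have data in the comma-separated column, something along these lines could populate the junction table as a one-off migration. This is only a sketch: it assumes the original pizzas table still has its toppings CSV column, and that every id in the CSV exists in the toppings table (ids that do not are silently skipped).

```sql
-- One-off migration sketch: expand each CSV list into junction rows.
-- REPLACE(..., ' ', '') removes all spaces so FIND_IN_SET can match "4, 6" style lists.
INSERT INTO pizzas_toppings (pizza_id, topping_id)
SELECT p.pizza_id, t.topping_id
FROM pizzas p
JOIN toppings t
    ON FIND_IN_SET(t.topping_id, REPLACE(p.toppings, ' ', '')) > 0;
```

Once the rows have been copied and checked, the old toppings column can be dropped.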

huangapple
  • Posted on February 27, 2023, 19:31:30
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75579913.html