Group, then "ungroup" rows with additional columns giving the group information
Question
Context
Let's say we have this table data (see the end for a ready-to-use query to create it):
+--+-----+-----+--------+
|id|name |color|shape |
+--+-----+-----+--------+
|1 |john |blue |square |
|2 |mary |green|square |
|3 |anna |red |triangle|
|4 |bob |blue |square |
|5 |susan|blue |square |
|6 |frank|red |triangle|
+--+-----+-----+--------+
With this query, it's possible to group rows by color and shape and, more importantly, to add aggregate information:
SELECT
    GROUP_CONCAT(name) AS names,
    color,
    shape,
    COUNT(*) AS nb_duplicates
FROM temp_users
GROUP BY color, shape;
Result:
+--------------+-----+--------+-------------+
|names |color|shape |nb_duplicates|
+--------------+-----+--------+-------------+
|john,bob,susan|blue |square |3 |
|mary |green|square |1 |
|anna,frank |red |triangle|2 |
+--------------+-----+--------+-------------+
Problem
But how is it possible to "ungroup" the rows, in order to have:
- one row per user (at least with its id, the rest can be joined);
- the information added after the grouping, especially nb_duplicates and a unique group id (maybe auto-incremented)?
Expected output
+--+-----+-----+--------+-------------+------------------+
|id|name |color|shape |nb_duplicates|duplicate_group_id|
+--+-----+-----+--------+-------------+------------------+
|1 |john |blue |square |3 |1 |
|2 |mary |green|square |1 |2 |
|3 |anna |red |triangle|2 |3 |
|4 |bob |blue |square |3 |1 |
|5 |susan|blue |square |3 |1 |
|6 |frank|red |triangle|2 |3 |
+--+-----+-----+--------+-------------+------------------+
Similar question
I found a similar question (https://stackoverflow.com/questions/12128077/mysql-count-group-by-yet-return-all-results) and tried the approach proposed there:
SELECT
    u.*,
    dups.nb_duplicates
FROM temp_users u
INNER JOIN (
    SELECT
        u2.color,
        u2.shape,
        COUNT(*) AS nb_duplicates
    FROM temp_users u2
    GROUP BY color, shape
) AS dups ON u.color = dups.color AND u.shape = dups.shape;
But I got this error from MySQL:
> [HY000][1137] Can't reopen table: 'u'
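For context, MySQL documents that a TEMPORARY table cannot be referred to more than once in the same query, and the join above reads temp_users twice (as u and as u2). One possible workaround is to materialize the per-group counts into a second temporary table first, then join against it. The sketch below only illustrates that two-step pattern. It assumes Python's built-in sqlite3 module as a stand-in engine (SQLite does not have this limitation, so it cannot reproduce the error itself), and the names `join_with_group_counts` and `temp_dups` are hypothetical.

```python
import sqlite3


def join_with_group_counts():
    """Two-step variant of the join: aggregate first, then join."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TEMPORARY TABLE temp_users ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "name TEXT, color TEXT, shape TEXT)"
    )
    conn.executemany(
        "INSERT INTO temp_users(name, color, shape) VALUES (?, ?, ?)",
        [
            ("john", "blue", "square"),
            ("mary", "green", "square"),
            ("anna", "red", "triangle"),
            ("bob", "blue", "square"),
            ("susan", "blue", "square"),
            ("frank", "red", "triangle"),
        ],
    )
    # Step 1: store the per-group counts in a separate temporary table
    # (temp_dups is a hypothetical name), reading temp_users only once.
    conn.execute(
        "CREATE TEMPORARY TABLE temp_dups AS "
        "SELECT color, shape, COUNT(*) AS nb_duplicates "
        "FROM temp_users GROUP BY color, shape"
    )
    # Step 2: join each user row to its group's count; temp_users is
    # again referenced only once in this statement.
    rows = conn.execute(
        "SELECT u.id, u.name, u.color, u.shape, d.nb_duplicates "
        "FROM temp_users u "
        "INNER JOIN temp_dups d "
        "ON u.color = d.color AND u.shape = d.shape "
        "ORDER BY u.id"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in join_with_group_counts():
        print(row)
```

This variant yields nb_duplicates only; a group id would still have to be added separately.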
Example table creation request
Just for those who want to quickly reproduce the table:
DROP TABLE IF EXISTS temp_users;
CREATE TEMPORARY TABLE temp_users (
    id INT AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(20),
    color VARCHAR(20),
    shape VARCHAR(20)
);
INSERT INTO temp_users(name, color, shape) VALUES
    ('john', 'blue', 'square'),
    ('mary', 'green', 'square'),
    ('anna', 'red', 'triangle'),
    ('bob', 'blue', 'square'),
    ('susan', 'blue', 'square'),
    ('frank', 'red', 'triangle');
Answer 1
Score: 1
SELECT *,
    COUNT(*) OVER (PARTITION BY color, shape) nb_duplicates,
    DENSE_RANK() OVER (ORDER BY color, shape) duplicate_group_id
FROM temp_users
ORDER BY id;
id | name | color | shape | nb_duplicates | duplicate_group_id |
---|---|---|---|---|---|
1 | john | blue | square | 3 | 1 |
2 | mary | green | square | 1 | 2 |
3 | anna | red | triangle | 2 | 3 |
4 | bob | blue | square | 3 | 1 |
5 | susan | blue | square | 3 | 1 |
6 | frank | red | triangle | 2 | 3 |
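How this works: COUNT(*) OVER (PARTITION BY color, shape) counts the rows that share a (color, shape) pair without collapsing them, and DENSE_RANK() OVER (ORDER BY color, shape) numbers the distinct pairs consecutively in sort order, which gives the group id. Window functions require MySQL 8.0 or later. To check the result without a MySQL server, here is a minimal sketch that runs the same query through Python's built-in sqlite3 module. It assumes SQLite 3.25 or newer (needed for window functions) as a stand-in engine, and the function name `ungroup_rows` is hypothetical.

```python
import sqlite3


def ungroup_rows():
    """Return one row per user with its group size and group id."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE temp_users ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "name TEXT, color TEXT, shape TEXT)"
    )
    conn.executemany(
        "INSERT INTO temp_users(name, color, shape) VALUES (?, ?, ?)",
        [
            ("john", "blue", "square"),
            ("mary", "green", "square"),
            ("anna", "red", "triangle"),
            ("bob", "blue", "square"),
            ("susan", "blue", "square"),
            ("frank", "red", "triangle"),
        ],
    )
    rows = conn.execute(
        """
        SELECT id, name, color, shape,
               -- size of the (color, shape) group this row belongs to
               COUNT(*) OVER (PARTITION BY color, shape) AS nb_duplicates,
               -- consecutive number per distinct (color, shape) pair
               DENSE_RANK() OVER (ORDER BY color, shape) AS duplicate_group_id
        FROM temp_users
        ORDER BY id
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in ungroup_rows():
        print(row)
```

Because the group ids follow the sort order of (color, shape), they can shift when a new combination is inserted, so they are stable only for a given snapshot of the data.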