英文:
Slow running query with table with ORDER BY and LIMIT
问题
以下是您提供的查询的翻译:
我有以下查询:
SELECT
m.id_ AS match_id,
mv.id_ AS match_version_id,
t.id_ AS tournament_id
FROM
test.match_version AS mv
JOIN
test.match_ AS m ON m.id_ = mv.match_id
JOIN
test.tournament AS t ON t.orig_id = m.tournament_orig_id
AND t.tour_id = m.tour_id
ORDER BY mv.id_ DESC
LIMIT 100;
当存在 `ORDER_BY` 时,运行时间大约为 4 秒,而比赛表有大约 100 万行,锦标赛表有大约 3 万行。如果删除 `ORDER_BY`,则运行时间为 0.002 秒。
以下是每个表的 `CREATE TABLE` 语句:
```sql
创建表 `match_`:
CREATE TABLE `match_` (
`id_` int NOT NULL AUTO_INCREMENT,
`tour_id` tinyint NOT NULL,
`tournament_orig_id` int NOT NULL,
PRIMARY KEY (`id_`),
KEY `ix__match__tour_id` (`tour_id`),
KEY `ix__match__tour_id__tournament_orig_id` (`tour_id`,`tournament_orig_id`),
KEY `ix__match__tournament_orig_id` (`tournament_orig_id`),
CONSTRAINT `fk__match__tournament` FOREIGN KEY (`tour_id`, `tournament_orig_id`) REFERENCES `tournament` (`tour_id`, `orig_id`)
) ENGINE=InnoDB AUTO_INCREMENT=1666470 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
创建表 `match_version`:
```sql
CREATE TABLE `match_version` (
`id_` int NOT NULL AUTO_INCREMENT,
`match_id` int NOT NULL,
`generation_start` int NOT NULL DEFAULT '0',
`generation_end` int NOT NULL DEFAULT '0',
PRIMARY KEY (`id_`),
KEY `ix__match_version__match_id` (`match_id`),
KEY `ix__match_version__match_id__generation_end` (`match_id`,`generation_end`),
KEY `ix__match_version__generation_end` (`generation_end`),
CONSTRAINT `fk__match_version__match_id` FOREIGN KEY (`match_id`) REFERENCES `match_` (`id_`)
) ENGINE=InnoDB AUTO_INCREMENT=1669206 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
创建表 `tournament`:
```sql
CREATE TABLE `tournament` (
`id_` int NOT NULL AUTO_INCREMENT,
`tour_id` tinyint NOT NULL,
`orig_id` int NOT NULL,
PRIMARY KEY (`id_`),
UNIQUE KEY `uq__tournament_oc__tour_id__orig_id` (`tour_id`,`orig_id`),
KEY `ix__tournament_oc__tour_id` (`tour_id`),
KEY `ix__tournament_oc__orig_id` (`orig_id`)
) ENGINE=InnoDB AUTO_INCREMENT=30962 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
这是 EXPLAIN
结果:
+----+-------------+-------+------------+-------+-------------------------------------------------------------------------------------------------+----------------------------------------+---------+-------------------------------+-------+----------+----------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+-------------------------------------------------------------------------------------------------+----------------------------------------+---------+-------------------------------+-------+----------+----------------------------------------------+
| 1 | SIMPLE | t | NULL | index | uq__tournament_oc__tour_id__orig_id,ix__tournament_oc__tour_id,ix__tournament_oc__orig_id | uq__tournament_oc__tour_id__orig_id | 5 | NULL | 30900 | 100 | Using index; Using temporary; Using filesort |
| 1 | SIMPLE | m | NULL | ref | PRIMARY,ix__match__tour_id,ix__match__tour_id__tournament_orig_id,ix__match__tournament_orig_id | ix__match__tour_id__tournament_orig_id | 5 | test.t.tour_id,test.t.orig_id | 48 | 100 | Using index |
| 1 | SIMPLE | mv | NULL | ref | ix__match_version__match_id,ix__match_version__match_id__generation_end | ix__match_version__match_id | 4 | test.m.id_ | 1 | 100 | Using index |
+----+-------------+-------+------------+-------+-------------------------------------------------------------------------------------------------+----------------------------------------+---------+-------------------------------+-------+----------+----------------------------------------------+
为什么存在 ORDER_BY
时性能明显较差?如何解决这个问题?
注意:值得一提的是,我知道没有 tournament_version.tournament_id
外键与 tournament.id_
相关联,但这是我目前使用的情况!
英文:
I have the following query:
SELECT
m.id_ AS match_id,
mv.id_ AS match_version_id,
t.id_ AS tournament_id
FROM
test.match_version AS mv
JOIN
test.match_ AS m ON m.id_ = mv.match_id
JOIN
test.tournament AS t ON t.orig_id = m.tournament_orig_id
AND t.tour_id = m.tour_id
ORDER BY mv.id_ DESC
LIMIT 100;
It is taking around 4 seconds to run where the match tables have circa 1m rows and the tournament table has circa 30k rows. If I remove the ORDER_BY
then it runs in 0.002 seconds.
Here are the CREATE TABLE
statements for each table:
CREATE TABLE `match_` (
`id_` int NOT NULL AUTO_INCREMENT,
`tour_id` tinyint NOT NULL,
`tournament_orig_id` int NOT NULL,
PRIMARY KEY (`id_`),
KEY `ix__match__tour_id` (`tour_id`),
KEY `ix__match__tour_id__tournament_orig_id` (`tour_id`,`tournament_orig_id`),
KEY `ix__match__tournament_orig_id` (`tournament_orig_id`),
CONSTRAINT `fk__match__tournament` FOREIGN KEY (`tour_id`, `tournament_orig_id`) REFERENCES `tournament` (`tour_id`, `orig_id`)
) ENGINE=InnoDB AUTO_INCREMENT=1666470 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
CREATE TABLE `match_version` (
`id_` int NOT NULL AUTO_INCREMENT,
`match_id` int NOT NULL,
`generation_start` int NOT NULL DEFAULT '0',
`generation_end` int NOT NULL DEFAULT '0',
PRIMARY KEY (`id_`),
KEY `ix__match_version__match_id` (`match_id`),
KEY `ix__match_version__match_id__generation_end` (`match_id`,`generation_end`),
KEY `ix__match_version__generation_end` (`generation_end`),
CONSTRAINT `fk__match_version__match_id` FOREIGN KEY (`match_id`) REFERENCES `match_` (`id_`)
) ENGINE=InnoDB AUTO_INCREMENT=1669206 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
CREATE TABLE `tournament` (
`id_` int NOT NULL AUTO_INCREMENT,
`tour_id` tinyint NOT NULL,
`orig_id` int NOT NULL,
PRIMARY KEY (`id_`),
UNIQUE KEY `uq__tournament_oc__tour_id__orig_id` (`tour_id`,`orig_id`),
KEY `ix__tournament_oc__tour_id` (`tour_id`),
KEY `ix__tournament_oc__orig_id` (`orig_id`)
) ENGINE=InnoDB AUTO_INCREMENT=30962 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
Here is the EXPLAIN
result:
+----+-------------+-------+------------+-------+-------------------------------------------------------------------------------------------------+----------------------------------------+---------+-------------------------------+-------+----------+----------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+-------------------------------------------------------------------------------------------------+----------------------------------------+---------+-------------------------------+-------+----------+----------------------------------------------+
| 1 | SIMPLE | t | NULL | index | uq__tournament_oc__tour_id__orig_id,ix__tournament_oc__tour_id,ix__tournament_oc__orig_id | uq__tournament_oc__tour_id__orig_id | 5 | NULL | 30900 | 100 | Using index; Using temporary; Using filesort |
| 1 | SIMPLE | m | NULL | ref | PRIMARY,ix__match__tour_id,ix__match__tour_id__tournament_orig_id,ix__match__tournament_orig_id | ix__match__tour_id__tournament_orig_id | 5 | test.t.tour_id,test.t.orig_id | 48 | 100 | Using index |
| 1 | SIMPLE | mv | NULL | ref | ix__match_version__match_id,ix__match_version__match_id__generation_end | ix__match_version__match_id | 4 | test.m.id_ | 1 | 100 | Using index |
+----+-------------+-------+------------+-------+-------------------------------------------------------------------------------------------------+----------------------------------------+---------+-------------------------------+-------+----------+----------------------------------------------+
How come I'm getting considerably poorer performance when the ORDER_BY
is present? How would I solve this?
Note: Worth mentioning that I know it's not ideal that there is no tournament_version.tournament_id
foreign key that links to tournament.id_
but that's sadly what I'm working with atm!
答案1
得分: 1
你的查询似乎在mysql的ORDER BY和LIMIT上遇到了性能问题,查询速度较慢。不幸的是,这是mysql中缺少的优化。
请参考这里的附加详细信息,以了解为什么此查询速度较慢:https://explainextended.com/2009/10/23/mysql-order-by-limit-performance-late-row-lookups/
如果这不起作用,您可以尝试重新构造查询,不包括ORDER BY和LIMIT两个关键字。正如您可能已经注意到的,如果您可以删除其中一个关键字,查询将再次变得快速。
英文:
Your query appears to be running into issues with the mysql order by limit slow rate look up. Unfortunately, this is a missing optimization in mysql.
See additional details here to understand why this query is slow. At the bottom of the page, it shows some strategies for how to speed up queries like this: https://explainextended.com/2009/10/23/mysql-order-by-limit-performance-late-row-lookups/
If that doesn't work you can try to restructure your query to not include both ORDER BY and LIMIT. As you probably noticed from toying with your query, if you can remove either of those keywords, it will be fast again.
答案2
得分: 0
我认为你的索引是不错的,但有一些多余的部分需要去掉。
因为 id_
是主键 (PRIAMRY KEY
),所以在任何其他索引中以 id_
开头是不必要且低效的。
由于没有 WHERE
过滤条件,因此优化器可能会决定以 ORDER BY
存在的索引开始。这也意味着它会在获取100行后停止。EXPLAIN
不太可能显示这种停止。
一个外键 (FOREIGN KEY
) 包括两个部分:
- 一个索引(也可以通过
INDEX
构建,同样有效) - 在另一个表中进行动态检查以确保存在关联。
使用 ENGINE=InnoDB 时,主键 (PRIMARY KEY
) 会自动添加到每个辅助索引的末尾,因此在上述索引的末尾添加 id_
是多余的(可选的)。
提示:这可能实际上有助于性能:
当你有 INDEX(a,b)
(或 UNIQUE(a,b)
)时,不需要再添加 INDEX(a)
。这是多余的,浪费空间,并可能误导优化器使用较短的索引而不是复合索引。我在每个表定义中看到了这种情况。
如果 EXPLAIN
没有将 mv
列为第一个表,有多种方法可以强制它首先使用。
英文:
I think your indexes are good, but there are some extras you should get rid of.
Since id_
is the PRIAMRY KEY
, it is unnecessary and inefficient to start any other index with id_
.
There are no WHERE
filtering, so the Optimizer will probably decide to start with an index for the ORDER BY
, which exists. This also means that it will stop after 100 rows. The EXPLAIN
is not likely to show this stoppage.
A FOREIGN KEY
is two things:
- An index (which could be constructed via
INDEX
-- equally well) - A dynamic check in the other table that there is a link.
With ENGINE=InnoDB, the PRIMARY KEY
is silently tacked on the end of each secondary index. So it is redundant (optional) to add id_
at the end of the above indexes.
Tip. And this may actually help with performance:
When you have INDEX(a,b)
(or UNIQUE(a,b)
), do not also have INDEX(a)
. It is redundant, wastes space, and may trick the Optimizer into using the shorter index instead of the composite one. (I see one such case in each table definition.)
If the EXPLAIN
fails to list mv
as the first table, there are multiple ways to force it to be used first.
答案3
得分: 0
感谢 @tay_inc 提供的链接。此代码在0.04秒内执行:
SELECT
m.id_ AS match_id,
mv.id_ AS match_version_id,
t.id_ AS tournament_id
FROM
test.match_version AS mv
JOIN
test.match_ AS m ON m.id_ = mv.match_id
JOIN
test.tournament AS t ON t.orig_id = m.tournament_orig_id
AND t.tour_id = m.tour_id
JOIN
(
SELECT
mv.id_ AS match_version_id
FROM
test.match_version AS mv
ORDER BY mv.id_ DESC
LIMIT 100
) AS o ON o.match_version_id = mv.id_
ORDER BY mv.id_ DESC;
我唯一能看到的问题是,它获取了mv
中的最后100条记录,而不管它们是否可以与m
和t
连接。在这种情况下对我来说不是问题,但值得记住...
英文:
Thanks to @tay_inc for the link. This code executes in 0.04 seconds:
SELECT
m.id_ AS match_id,
mv.id_ AS match_version_id,
t.id_ AS tournament_id
FROM
test.match_version AS mv
JOIN
test.match_ AS m ON m.id_ = mv.match_id
JOIN
test.tournament AS t ON t.orig_id = m.tournament_orig_id
AND t.tour_id = m.tour_id
JOIN
(
SELECT
mv.id_ AS match_version_id
FROM
test.match_version AS mv
ORDER BY mv.id_ DESC
LIMIT 100
) AS o ON o.match_version_id = mv.id_
ORDER BY mv.id_ DESC;
The only issue I can see is that this takes the last 100 records in mv
regardless of whether they can be joined to m
and t
. Not an issue for me in this case but worth bearing in mind...
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论