英文:
Performance drop upgrading from MySQL 5.7.33 to 8.0.31 - why did it stop using the Index Condition Pushdown Optimization?
问题
我有这样的表格(为了可读性省略了细节):
CREATE TABLE UserData (
id bigint NOT NULL AUTO_INCREMENT,
userId bigint NOT NULL DEFAULT '0', ...
c6 int NOT NULL DEFAULT '0', ...
hidden int NOT NULL DEFAULT '0', ...
c22 int NOT NULL DEFAULT '0', ...
PRIMARY KEY (id), ...
KEY userId_hidden_c6_c22_idx (userId,hidden,c6,c22), ...
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb3
在MySQL 5.7中,我很高兴地这样查询它:
mysql> select * from UserData use index (userId_hidden_c6_c22_idx) where (userId = 123 AND hidden = 0) order by id DESC limit 10 offset 0;
+----
| ...
+----
10 rows in set (0.03 sec)
然而,在MySQL 8.0中,这些查询开始变成这样:
mysql> select * from UserData use index (userId_hidden_c6_c22_idx) where (userId = 123 AND hidden = 0) order by id DESC limit 10 offset 0;
+----
| ...
+----
10 rows in set (1.56 sec)
Explain显示了以下内容,5.7:
mysql> explain select * from UserData use index (userId_hidden_c6_c22_idx) where (userId = 123 AND hidden = 0) order by id DESC limit 10 offset 0;
+----+-------------+----------+------------+------+---------------+---------------+---------+-------------+-------+----------+---------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------+------------+------+---------------+---------------+---------+-------------+-------+----------+---------------------------------------+
| 1 | SIMPLE | UserData | NULL | ref | userId_hidden_c6_c22_idx | userId_hidden_c6_c22_idx | 12 | const,const | 78062 | 100.00 | Using index condition; Using filesort |
+----+-------------+----------+------------+------+---------------+---------------+---------+-------------+-------+----------+---------------------------------------+
8.0:
mysql> explain select * from UserData use index (userId_hidden_c6_c22_idx) where (userId = 123 AND hidden = 0) order by id DESC limit 10 offset 0;
+----+-------------+----------+------------+------+---------------+---------------+---------+-------------+-------+----------+----------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------+------------+------+---------------+---------------+---------+-------------+-------+----------+----------------+
| 1 | SIMPLE | UserData | NULL | ref | userId_hidden_c6_c22_idx | userId_hidden_c6_c22_idx | 12 | const,const | 79298 | 100.00 | Using filesort |
+----+-------------+----------+------------+------+---------------+---------------+---------+-------------+-------+----------+----------------+
主要区别似乎在于5.7中是“Using index condition; Using filesort”,而8.0中只有“Using filesort”。
编辑:为什么在MySQL 8.0中性能下降了10-100倍?看起来是因为它停止使用了索引条件推送优化 - 如何使它重新开始使用它?
表格中大约有150百万行数据,而该用户有大约75,000条记录,所以我猜测这可能是基于大小的启发式算法或MySQL决策的一些改变?
英文:
I have a table like this (details elided for readability):
CREATE TABLE UserData (
id bigint NOT NULL AUTO_INCREMENT,
userId bigint NOT NULL DEFAULT '0', ...
c6 int NOT NULL DEFAULT '0', ...
hidden int NOT NULL DEFAULT '0', ...
c22 int NOT NULL DEFAULT '0', ...
PRIMARY KEY (id), ...
KEY userId_hidden_c6_c22_idx (userId,hidden,c6,c22), ...
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb3
and was happily doing queries on it like this in MySQL 5.7:
mysql> select * from UserData use index (userId_hidden_c6_c22_idx) where (userId = 123 AND hidden = 0) order by id DESC limit 10 offset 0;
+----
| ...
+----
10 rows in set (0.03 sec)
However, in MySQL 8.0 these queries started doing this:
mysql> select * from UserData use index (userId_hidden_c6_c22_idx) where (userId = 123 AND hidden = 0) order by id DESC limit 10 offset 0;
+----
| ...
+----
10 rows in set (1.56 sec)
Explain shows the following, 5.7:
mysql> explain select * from UserData use index (userId_hidden_c6_c22_idx) where (userId = 123 AND hidden = 0) order by id DESC limit 10 offset 0;
+----+-------------+----------+------------+------+---------------+---------------+---------+-------------+-------+----------+---------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------+------------+------+---------------+---------------+---------+-------------+-------+----------+---------------------------------------+
| 1 | SIMPLE | UserData | NULL | ref | userId_hidden_c6_c22_idx | userId_hidden_c6_c22_idx | 12 | const,const | 78062 | 100.00 | Using index condition; Using filesort |
+----+-------------+----------+------------+------+---------------+---------------+---------+-------------+-------+----------+---------------------------------------+
8.0:
mysql> explain select * from UserData use index (userId_hidden_c6_c22_idx) where (userId = 123 AND hidden = 0) order by id DESC limit 10 offset 0;
+----+-------------+----------+------------+------+---------------+---------------+---------+-------------+-------+----------+----------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------+------------+------+---------------+---------------+---------+-------------+-------+----------+----------------+
| 1 | SIMPLE | UserData | NULL | ref | userId_hidden_c6_c22_idx | userId_hidden_c6_c22_idx | 12 | const,const | 79298 | 100.00 | Using filesort |
+----+-------------+----------+------------+------+---------------+---------------+---------+-------------+-------+----------+----------------+
The main difference seems to be that 5.7 is Using index condition; Using filesort
and 8.0 is only Using filesort
.
<s>Why did 8.0 stop using the index condition, and how can I get it to start using it?</s>
EDIT: Why did performance drop 10-100x with MySQL 8.0? It looks like it's because it stopped using the Index Condition Pushdown Optimization - how can I get it to start using it?
The table has ~150M rows in it, and that user has ~75k records, so I guess it could be a change in the size-based heuristics or whatever goes into the MySQL decision making?
答案1
得分: 2
在你展示的EXPLAIN中,type
列是ref
,key
列指定了索引的名称,这表明它正在使用该索引来优化查询。
关于“index condition”的说明似乎被你误解了。诚然,如果该备注缺失,它似乎是在讨论“使用索引”与“不使用索引”。但关于“index condition”的说明实际上是在提到索引条件推送,它与使用索引无关,而是指的是将其他条件委托给存储引擎层进行过滤。你可以在这里了解更多信息:https://dev.mysql.com/doc/refman/8.0/en/index-condition-pushdown-optimization.html
遗憾的是,EXPLAIN报告的说明确实很难理解。要真正理解如何阅读这些说明,你需要仔细研究大量的文档。
英文:
In the EXPLAIN you show, the type
column is ref
and the key
column names the index, which indicates it is using that index to optimize the lookup.
You are making an incorrect interpretation of what "index condition" means in the extra column. Admittedly it does sound like "using the index" versus not using the index if that note is absent.
The note about "index condition" is referring to Index Condition Pushdown, which is not related to using the index, but it's about delegating other conditions to be filtered at the storage engine level. Read about it here: https://dev.mysql.com/doc/refman/8.0/en/index-condition-pushdown-optimization.html
It's unfortunate that the notes reported by EXPLAIN are so difficult to understand. You really have to study a lot of documentation to understand how to read those notes.
答案2
得分: 0
这在任何版本中都会更快,因为它将在10行之后停止。也就是说,将避免"filesort"。
索引:(userId、hidden、id)
这不会显示"Using index"(又名"covering"),但你的尝试也没有。这与"Using index condition"(又名"ICP",正如你指出的)是不同的。
尝试使用以下方法获取更多信息:
EXPLAIN FORMAT_JSON SELECT ...
EXPLAIN ANALYZE SELECT ...
(不,我不能解释这个回归。)
英文:
This would be much faster in either version because it would stop after 10 rows. That is, the "filesort" would be avoided.
INDEX(userId, hidden, id)
This won't do "Using index" (aka "covering"), but neither did your attempts. That is different from "Using index condition" (aka "ICP", as you point out).
Try these to get more insight:
EXPLAIN FORMAT_JSON SELECT ...
EXPLAIN ANALYZE SELECT ...
(No, I cannot explain the regression.)
答案3
得分: 0
问题实际上应该是:“为什么在5.7版本中使用ICP”?
唯一可能被“推送”到索引的索引条件必须从条件“userId = 123 AND hidden = 0”中提取出来。
这个条件已经通过在userId_hidden_c6_c22_idx上使用“ref”访问类型来完全评估,而“ref”是“const,const”。(指的是使用的“const”文字[123,0])
因此,在索引上添加条件“userId = 123 AND hidden = 0”不会提高选择性。
8.0版本中的优化器已经改进,可以检测到这种冗余,因此不会将其添加为ICP。
英文:
The question should really be: 'Why did it use ICP in 5.7'?
The only index condition which could have been 'pushed' to the index has to be extracted from the condition 'userId = 123 AND hidden = 0'.
That condition is already completely evaluate by using the 'ref' access type on userId_hidden_c6_c22_idx, with the 'ref' being "const,const". (Refering to that the 'const' literals [123,0] being used.
Thus, adding the condition 'userId = 123 AND hidden = 0' on the index does not improve the selectivity.
The optimizer in 8.0 has been improved to detect this redundanzy, thus does not add it as an ICP.
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论