英文:
Simple aggregating query on indexed table with 14M rows taking long
问题
I'm having trouble running a simple aggregating query on a relatively large table (~14M rows). The query in question SELECT stageID, SUM(time) as totalTime FROM <table> GROUP BY stageID
results in 4052 rows and takes ~7 seconds. Running mysql from 11.0.2-MariaDB, client 15.2 for Win64 (AMD64) on a local machine with 16GB RAM, an i7 chip. The data is on an SSD drive so I don't understand the amount of time it takes to send the data..
使用 EXPLAIN EXTENDED SELECT ...
:
id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
---|---|---|---|---|---|---|---|---|---|---|
1 | SIMPLE | <table> | index | NULL | uniquekey | 8 | NULL | 13744857 | 100.00 | NULL |
使用 EXPLAIN <table
>:
Field | Type | Null | Key | Default | Extra |
---|---|---|---|---|---|
stageID | int(11) | NO | PRI | NULL | |
playerID | int(11) | NO | PRI | NULL | |
time | int(11) | NO | 0 |
Included indices:
UNIQUE KEY (stageID, playerID) FOREIGN KEY (stageID) FOREIGN KEY (playerID)
ENGINE=InnoDB DEFAULT CHARSET=utf8mb3 COLLATE=utf8mb3_general_ci
Here's the profile of that query:
Status | Duration |
---|---|
Starting | 0.000050 |
checking permissions | 0.000005 |
Opening tables | 0.000013 |
After opening tables | 0.000003 |
System lock | 0.000005 |
table lock | 0.000005 |
init | 0.000017 |
Optimizing | 0.000007 |
Statistics | 0.000021 |
Preparing | 0.000015 |
Sorting result | 0.000007 |
Executing | 0.000002 |
Sending data | 6.893190 |
End of update loop | 0.000010 |
Query end | 0.000003 |
Commit | 0.000004 |
closing tables | 0.000003 |
Unlocking tables | 0.000002 |
closing tables | 0.000007 |
Starting cleanup | 0.000002 |
Freeing items | 0.000003 |
Updating status | 0.000039 |
Reset for next command | 0.000004 |
Upgrading the resources available to MariaDB by using the my-huge.ini config files etc., no changes in execution time.
英文:
I'm having trouble running a simple aggregating query on a relatively large table (~14M rows). The query in question SELECT stageID, SUM(time) as totalTime FROM <table> GROUP BY stageID
results in 4052 rows and takes ~7 seconds. Running mysql from 11.0.2-MariaDB, client 15.2 for Win64 (AMD64) on a local machine with 16GB RAM, an i7 chip. The data is on an SSD drive so I don't understand the amount of time it takes to send the data..
Using EXPLAIN EXTENDED SELECT ...
:
id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
---|---|---|---|---|---|---|---|---|---|---|
1 | SIMPLE | <table> | index | NULL | uniquekey | 8 | NULL | 13744857 | 100.00 | NULL |
Using EXPLAIN <table
>:
Field | Type | Null | Key | Default | Extra |
---|---|---|---|---|---|
stageID | int(11) | NO | PRI | NULL | |
playerID | int(11) | NO | PRI | NULL | |
time | int(11) | NO | 0 |
Included indices:
UNIQUE KEY (stageID, playerID)
FOREIGN KEY (stageID)
FOREIGN KEY (playerID)
ENGINE=InnoDB
DEFAULT CHARSET=utf8mb3
COLLATE=utf8mb3_general_ci
Here's the profile of that query:
Status | Duration |
---|---|
Starting | 0.000050 |
checking permissions | 0.000005 |
Opening tables | 0.000013 |
After opening tables | 0.000003 |
System lock | 0.000005 |
table lock | 0.000005 |
init | 0.000017 |
Optimizing | 0.000007 |
Statistics | 0.000021 |
Preparing | 0.000015 |
Sorting result | 0.000007 |
Executing | 0.000002 |
Sending data | 6.893190 |
End of update loop | 0.000010 |
Query end | 0.000003 |
Commit | 0.000004 |
closing tables | 0.000003 |
Unlocking tables | 0.000002 |
closing tables | 0.000007 |
Starting cleanup | 0.000002 |
Freeing items | 0.000003 |
Updating status | 0.000039 |
Reset for next command | 0.000004 |
Upgrading the resources available to MariaDB by using the my-huge.ini config files etc., no changes in execution time.
答案1
得分: 2
以下是已翻译的内容:
"The 'Sending data' line in your profile output proves that almost all of the 7 seconds is spent copying rows of data between the storage engine layer and the SQL layer."
你的资料输出中的“发送数据”行证明了几乎所有的7秒都花在在存储引擎层和SQL层之间复制数据行上。
"The EXPLAIN for your query shows an estimate of examining 13744857 rows, and there's no amount of indexes or resources that can change that."
你的查询的解释显示了估计检查了13744857行,无论索引或资源多少都无法改变这一点。
"The only ways to speed up this query are:
-
Reduce the examined rows by using an indexed condition in the WHERE clause. But that would be a different query. I assume you need the query to sum all the rows for all stageIds.
-
Reduce the stored rows in this table by deleting or archiving some.
-
Use a summary table to record the sums and keep them updated as rows change in your primary table. It may still take 7 seconds to update the summary table, but you don't have to recalculate the sums every time you read them.
-
Get a faster computer."
加速此查询的唯一方法是:
-
通过在WHERE子句中使用索引条件来减少检查的行数。但那将是一个不同的查询。我假设你需要查询来汇总所有阶段ID的行。
-
通过删除或存档一些内容来减少此表中存储的行。
-
使用摘要表记录总数,并在主要表中的行发生更改时保持其更新。更新摘要表可能仍需要7秒,但您不必每次读取它们时重新计算总数。
-
获取一台更快的计算机。
英文:
The "Sending data" line in your profile output proves that almost all of the 7 seconds is spent copying rows of data between the storage engine layer and the SQL layer.
The EXPLAIN for your query shows an estimate of examining 13744857 rows, and there's no amount of indexes or resources that can change that.
The only ways to speed up this query are:
-
Reduce the examined rows by using an indexed condition in the WHERE clause. But that would be a different query. I assume you need the query to sum all the rows for all stageIds.
-
Reduce the stored rows in this table by deleting or archiving some.
-
Use a summary table to record the sums and keep them updated as rows change in your primary table. It may still take 7 seconds to update the summary table, but you don't have to recalculate the sums every time you read them.
-
Get a faster computer.
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论