MySQL 8.0.33 error when selecting json column : Out of sort memory, consider increasing server sort buffer size

huangapple go评论54阅读模式
英文:

MySQL 8.0.33 error when selecting json column : Out of sort memory, consider increasing server sort buffer size

问题

我有一个包含JSON列的表。我想选择JSON列中的特定数据。
我的查询在MySQL 5.7(Ubuntu 16.04)中正常工作,但在MySQL 8.0.33(Ubuntu 22.04)中不起作用:

查询:

select json_unquote(json_extract(`campaigns`.`model`, ''$.\"migration\"')) as `migration` 
from `campaigns` 
order by `created_at` desc 
limit 10 offset 0;

错误:

Out of sort memory, consider increasing server sort buffer size

sort_buffer_size变量当前为:
SHOW global variables LIKE ''sort_buffer_size';
结果:262144

我尝试增加sort_buffer_size并重新启动服务器,但没有成功。

这是MySQL 8中的一个错误吗?我应该以更优化的方式重写我的查询吗?

谢谢。

英文:

I have a table with a JSON column. I want to select a specific data within the JSON column.
My query was working fine with MySQL 5.7 (Ubuntu 16.04), but does not work with MySQL 8.0.33 (Ubuntu 22.04) :

Query :

select json_unquote(json_extract(`campaigns`.`model`, '$.\"migration\"')) as `migration` 
from `campaigns` 
order by `created_at` desc 
limit 10 offset 0;

Error :
> Out of sort memory, consider increasing server sort buffer size

The sort_buffer_size variable is currently :
SHOW global variables LIKE 'sort_buffer_size';
Result : 262144

I tried to increase the sort_buffer_size and restart the server, but without success.

Is this a bug in mysql 8 ? should I rewrite my query in a more optimized form ?

thanks

答案1

得分: 1

这似乎是一个错误。这是 MySQL 8.0.20 中一项假定改进的结果。请查看 https://bugs.mysql.com/bug.php?id=103225

最佳的解决方法是确保你的查询通过在 created_at 列上创建索引来优化排序。如果查询以索引顺序读取行,而这与你想要的顺序相同,那么它将不需要使用排序缓冲区。

如果无法创建该索引,MySQL 必须以某种其他顺序读取行(可能是主键顺序),并且必须对行进行分批排序。这些批次存储在内存中的排序缓冲区中,排序缓冲区的大小必须足够大以容纳多行。默认的排序缓冲区大小相当适中,如你所发现的是256KB。

你说你尝试过增加它,但如果你的 JSON 文档很大,你将不得不大幅增加排序缓冲区的大小。我猜测要比你在表中存储的最大 JSON 文档大10-20倍。

你可以检查你的最大 JSON 文档有多大(至少是当前存储在你的表中的文档中的最大文档):

SELECT MAX(JSON_STORAGE_SIZE(model)) FROM campaigns; 

这里是 MySQL 发布说明中关于这个改变的解释:

https://dev.mysql.com/doc/relnotes/mysql/8.0/en/news-8-0-20.html 说:

以前,将排序操作的有效负载中的任何 TINYBLOBBLOB 类型大于的列作为升序操作的一部分导致服务器仅回退到仅对行ID进行排序,而不是完整行;这导致在排序完成后从磁盘获取行本身需要进行第二次传递。由于 JSONGEOMETRY 列在内部实现为 LONGBLOB,这在这些列类型的情况下引起了相同的行为,尽管它们几乎总是远远小于 LONGBLOB 的4GB最大值(甚至 MEDIUMBLOB 的16MB最大值)。在这些情况下,服务器现在会将这些列类型转换为紧凑型附加组件,就像它对待 TINYBLOBBLOB 列一样,在测试中显示出显着的性能提高。在这方面,MEDIUMBLOBLONGBLOB 列的处理保持不变。

这项增强的一个影响是,如果排序缓冲区的大小不足,尝试对包含非常大(多兆字节)JSONGEOMETRY 列值的行进行排序时,现在可能会发生 Out of memory 错误;通常可以通过增加 sort_buffer_size 系统变量的值来补偿这个问题。

依我看,这是我们在 MySQL 中使用 JSON 时应该谨慎和犹豫的许多原因之一。在可能的情况下,请使用普通的行和列。

英文:

It's sort of a bug. It's the consequence of a supposed improvement in MySQL 8.0.20. See https://bugs.mysql.com/bug.php?id=103225

The best workaround is to make sure your query optimizes the sort, by creating an index on your created_at column. If the query reads the rows in index order which is the same order you want them, then it won't need to use a sort buffer at all.

If you can't create that index, MySQL must read the rows in some other order (probably primary key order), and must sort the rows in batches. These batches are stored in a sort buffer in memory, and the sort buffer must be large enough to hold several rows. The default sort buffer size is quite modest, as you found it is 256KB.

You said you tried to increase it, but if your JSON documents are large, you would have to increase the sort buffer size a lot. I'd guess 10-20x the largest JSON document you store in that table.

You can check how large your largest JSON document is (at least of the documents currently stored in your table):

SELECT MAX(JSON_STORAGE_SIZE(model)) FROM campaigns; 

Here is an explanation from the MySQL release notes about this change:

https://dev.mysql.com/doc/relnotes/mysql/8.0/en/news-8-0-20.html says:

> Previously, including any column of a blob type larger than TINYBLOB or BLOB as the payload in an ordering operation caused the server to revert to sorting row IDs only, rather than complete rows; this resulted in a second pass to fetch the rows themselves from disk after the sort was completed. Since JSON and GEOMETRY columns are implemented internally as LONGBLOB, this caused the same behavior with these types of columns even though they are almost always much shorter than the 4GB maximum for LONGBLOB (or even the 16 MB maximum for MEDIUMBLOB). The server now converts columns of these types into packed addons in such cases, just as it does TINYBLOB and BLOB columns, which in testing showed a significant performance increase. The handling of MEDIUMBLOB and LONGBLOB columns in this regard remains unchanged.
>
> One effect of this enhancement is that it is now possible for Out of memory errors to occur when trying to sort rows containing very large (multi-megabtye) JSON or GEOMETRY column values if the sort buffers are of insufficient size; this can be compensated for in the usual fashion by increasing the value of the sort_buffer_size system variable.

In my opinion, this is one of many reasons why we should be cautious and hesitant to use JSON in MySQL. Use normal rows and columns wherever possible.

huangapple
  • 本文由 发表于 2023年6月26日 22:11:58
  • 转载请务必保留本文链接:https://go.coder-hub.com/76557503.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定