SQL查询中的ORDER BY和LIMIT花费了很多时间。

huangapple go评论76阅读模式
英文:

SQL query order by and limit taking a lot of time

问题

我正在查询一个拥有大约4百万数据、包括两种skuID类型的表格,并在版本上使用order by,因为每个skuID平均有大约5千个数据版本,然后使用limit获取最顶部的版本。

查询:

select * FROM table rb 
    WHERE rb.sku_id='' or rb.package_sku_id=''
    order by version desc
limit 1;

通过对查询进行解释,我们发现ORDER BY和LIMIT占据了查询的大部分成本:

Limit  (cost=0.43..5304.64 rows=1 width=861) (actual time=50327.036..50327.041 rows=1 loops=1)
  Buffers: shared hit=361280 read=104302 written=18
  I/O Timings: read=40363.693 write=0.215
  ->  Index Scan Backward using "IDX488yr43nr28a1yml9lb5i7jfv" on referral_benefits rb  (cost=0.43..9552890.48 rows=1801 width=861) (actual time=50327.028..50327.028 rows=1 loops=1)
        Filter: (((sku_id)::text = 'b1d5fa77-fda8-466c-b8b0-a8e60d1f78a5'::text) OR ((package_sku_id)::text = 'b1d5fa77-fda8-466c-b8b0-a8e60d1f78a5'::text))
        Rows Removed by Filter: 1361027
        Buffers: shared hit=361280 read=104302 written=18
        I/O Timings: read=40363.693 write=0.215
Planning Time: 1.121 ms
Execution Time: 50329.843 ms

我们已经在skuId和package_sku_id上添加了索引,但这并没有减少时间。

需要在这方面给予指导。
提前感谢。

英文:

I am querying a table with around 4M data with two types of skuID and using order by on version as there are around avg 5k versions of data for one skuID and fetching the top most versions using limit.

Query:

 select * FROM table rb 
    WHERE rb.sku_id='' or rb.package_sku_id=''
    order by version desc
limit 1;

By using explain on query we found out that ORDER BY and LIMIT is taking most of the cost for the query:

Limit  (cost=0.43..5304.64 rows=1 width=861) (actual time=50327.036..50327.041 rows=1 loops=1)
  Buffers: shared hit=361280 read=104302 written=18
  I/O Timings: read=40363.693 write=0.215
  ->  Index Scan Backward using "IDX488yr43nr28a1yml9lb5i7jfv" on referral_benefits rb  (cost=0.43..9552890.48 rows=1801 width=861) (actual time=50327.028..50327.028 rows=1 loops=1)
        Filter: (((sku_id)::text = 'b1d5fa77-fda8-466c-b8b0-a8e60d1f78a5'::text) OR ((package_sku_id)::text = 'b1d5fa77-fda8-466c-b8b0-a8e60d1f78a5'::text))
        Rows Removed by Filter: 1361027
        Buffers: shared hit=361280 read=104302 written=18
        I/O Timings: read=40363.693 write=0.215
Planning Time: 1.121 ms
Execution Time: 50329.843 ms

We have added indexes on skuId and package_sku_id combined but that is not reducing the time.

Need guidance on this.
Thanks in advance

答案1

得分: 1

如果您有两个索引,一个在(sku_id, version)上,另一个在(package_sku_id, version)上,那么您可以通过以下方式获得非常快的执行速度:

(select * FROM rb WHERE rb.sku_id='b1d5fa77-fda8-466c-b8b0-a8e60d1f78a5' order by version desc) 
    union all
(select * FROM rb WHERE rb.package_sku_id='b1d5fa77-fda8-466c-b8b0-a8e60d1f78a5' order by version desc)
order by version desc limit 1;

但请注意,如果您将LIMIT增加到1之外,那么如果对于同一行同时满足两个SKU条件,此查询可能会返回重复的行。

英文:

If you have two indexes, one on (sku_id, version) and one on (package_sku_id, version), then you could get very fast execution by doing:

(select * FROM rb WHERE rb.sku_id='b1d5fa77-fda8-466c-b8b0-a8e60d1f78a5' order by version desc) 
    union all
(select * FROM rb WHERE rb.package_sku_id='b1d5fa77-fda8-466c-b8b0-a8e60d1f78a5' order by version desc)
order by version desc limit 1;

But note that if you increase the LIMIT beyond 1, then this query might return duplicate rows if both SKU conditions are satisfied for the same row.

答案2

得分: 1

PostgreSQL支持部分索引,即仅适用于表的子集的索引。使用该索引类型,并让它仅包含您想查看的行。然后,DBMS只需读取该索引,数据就会按排序顺序输出:

create index idx on mytable (version desc)
                 where sku_id = '' or package_sku_id = '';

另一个选项是创建一个单独的表,仅包含所需的顶部行。编写触发器以保持该表的实时性。

英文:

PostgreSQL features partial indexes, i.e. indexes that only apply to a subset of the table. Use that index type and have it only contain the rows you want to look at. Then the DBMS only has to read that index and the data comes out sorted:

create index idx on mytable (version desc)
                 where sku_id = '' or package_sku_id = '';

Another option would be a separate table only containing the desired top row. Write a trigger to keep that up-to-date.

huangapple
  • 本文由 发表于 2023年6月5日 20:27:06
  • 转载请务必保留本文链接:https://go.coder-hub.com/76406408.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定