Multiple orders of magnitude slower when adding an additional OR to the WHERE condition

# Question
In a PostgreSQL 14 database I have a transactions table currently containing around 400 million rows, and the following query on it:
SELECT "transactions"."id"
FROM "transactions"
WHERE ("wallet_id" = $1)
ORDER BY "transactions"."id" DESC
LIMIT 10
This works fine and the query is fast. The EXPLAIN ANALYZE
output:
Limit (cost=84991.29..84991.31 rows=10 width=146) (actual time=1.518..1.519 rows=2 loops=1)
-> Sort (cost=84991.29..85056.88 rows=26235 width=146) (actual time=1.517..1.518 rows=2 loops=1)
Sort Key: id DESC
Sort Method: quicksort Memory: 25kB
-> Index Scan using transactions_wallet_id_index on transactions (cost=0.57..84424.36 rows=26235 width=146) (actual time=1.080..1.497 rows=2 loops=1)
Index Cond: (wallet_id = $1)
Planning Time: 0.850 ms
Execution Time: 1.682 ms
If I add a second wallet id to the where, then the query takes several minutes when both wallets have a small number of transactions (2 for the first one, 57 for the second one):
SELECT "transactions"."id"
FROM "transactions"
WHERE ("wallet_id" = $1 OR "wallet_id" = $2)
ORDER BY "transactions"."id" DESC
LIMIT 10
Its analysis:
Limit (cost=0.57..117334.73 rows=10 width=146) (actual time=3683.524..944867.724 rows=10 loops=1)
-> Index Scan Backward using transactions_pkey on transactions (cost=0.57..615664040.26 rows=52471 width=146) (actual time=3683.523..944867.715 rows=10 loops=1)
Filter: ((wallet_id = $1) OR (wallet_id = $2))
Rows Removed by Filter: 117937409
Planning Time: 0.810 ms
Execution Time: 944867.797 ms
After investigating this for hours, it seems the issue comes from using `ORDER BY` in combination with `LIMIT`; indeed, if I remove either of the two, the query runs fast. If at least one of the wallets has a non-small number of transactions, it also runs fast. `id` is the primary key and there's an index on `wallet_id`.
This is very disappointing and frustrating. It's my first time using Postgres and the fact that the query planner does such a poor job on such a simple query is honestly hard to understand.
I'd appreciate some suggestions on how to make the query faster for all cases.
I have tried different things for several hours now, including running `VACUUM` (just in case) and `ANALYZE` on the table, but to no avail.
# Answer 1

**Score**: 4
If you look at this issue from the query planner's perspective, it has two significant operations to do here:
1. filter the data
2. sort the data
For 1., the index you have created, transactions_wallet_id_index, is preferable. For 2., the index that comes with the primary key is better (well, scanning it backwards is). Note how your optimal query has a logical Sort operation after the data is filtered, while the plan for the `OR` query does not; it just has a limit.
I created a 2-million-row table to recreate your scenario.
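A minimal sketch of one way such a skewed table could be built (this is an illustrative assumption, not the exact setup used; the real per-wallet counts are shown below):
~~~pgsql
CREATE TABLE transactions (
    id        bigserial PRIMARY KEY,
    wallet_id integer NOT NULL
);

-- Skewed wallets: each one roughly 4x bigger than the previous.
INSERT INTO transactions (wallet_id)
SELECT w
FROM generate_series(0, 9) AS w,
     generate_series(1, (2 * 4 ^ w)::int);

-- One dominant wallet to pad the table out to roughly 2 million rows.
INSERT INTO transactions (wallet_id)
SELECT 10 FROM generate_series(1, 1300000);

CREATE INDEX transactions_wallet_id_index ON transactions (wallet_id);
ANALYZE transactions;
~~~
Checking the distribution: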
~~~pgsql
select wallet_id, count(*) from transactions
group by wallet_id
order by wallet_id
~~~
| wallet_id | count(*) |
|-----------|----------|
| 0 | 2 |
| 1 | 12 |
| 2 | 48 |
| 3 | 192 |
| 4 | 768 |
| 5 | 3072 |
| 6 | 12288 |
| 7 | 49152 |
| 8 | 196608 |
| 9 | 786432 |
| 10 | 951426 |
Now if we select for two very small wallets, like say 2 and 3, it'll look like what you're expecting, a bitmap OR of the two conditions:
~~~pgsql
explain analyze select id from transactions where wallet_id = 2 or wallet_id = 3 order by id desc limit 10
~~~
~~~
Limit  (cost=502.54..502.57 rows=10 width=4) (actual time=0.101..0.104 rows=10 loops=1)
  ->  Sort  (cost=502.54..502.87 rows=133 width=4) (actual time=0.100..0.102 rows=10 loops=1)
        Sort Key: id DESC
        Sort Method: top-N heapsort  Memory: 25kB
        ->  Bitmap Heap Scan on transactions  (cost=9.93..499.67 rows=133 width=4) (actual time=0.033..0.060 rows=240 loops=1)
              Recheck Cond: ((wallet_id = 2) OR (wallet_id = 3))
              Heap Blocks: exact=3
              ->  BitmapOr  (cost=9.93..9.93 rows=133 width=0) (actual time=0.024..0.024 rows=0 loops=1)
                    ->  Bitmap Index Scan on ux1  (cost=0.00..4.44 rows=1 width=0) (actual time=0.014..0.015 rows=48 loops=1)
                          Index Cond: (wallet_id = 2)
                    ->  Bitmap Index Scan on ux1  (cost=0.00..5.42 rows=133 width=0) (actual time=0.009..0.009 rows=192 loops=1)
                          Index Cond: (wallet_id = 3)
Planning Time: 0.097 ms
Execution Time: 0.125 ms
~~~
But at some point, we see what you're seeing, wallets 3 & 4:
~~~pgsql
explain analyze select id from transactions where wallet_id = 3 or wallet_id = 4 order by id desc limit 10
~~~
~~~
Limit  (cost=0.43..728.01 rows=10 width=4) (actual time=171.911..171.915 rows=10 loops=1)
  ->  Index Scan Backward using transactions_pkey on transactions  (cost=0.43..72758.43 rows=1000 width=4) (actual time=171.910..171.912 rows=10 loops=1)
        Filter: ((wallet_id = 3) OR (wallet_id = 4))
        Rows Removed by Filter: 999489
Planning Time: 0.095 ms
Execution Time: 171.932 ms
~~~
The query planner just thinks sorting that set is going to be more expensive than filtering it.
There's not really a way around it given this query (that I would know of); `OR` queries are just messy when it comes to statistics, and it's definitely not just Postgres that suffers from this, believe me.
You can try `ANALYZE transactions`, but chances are your stats are already up to date. Extended statistics won't help you here either. You could possibly tweak some of the performance settings (especially `work_mem` and maybe `random_page_cost`) to force the correct plan here, but you risk messing up other queries.
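For example, a session-level experiment might look like this (a sketch; the values are illustrative assumptions, not recommendations):
~~~pgsql
-- Session-level experiment only; both values are illustrative.
SET work_mem = '256MB';        -- give sorts and bitmaps more memory
SET random_page_cost = 1.1;    -- tell the planner random I/O is cheap (SSD-like)

EXPLAIN ANALYZE
SELECT id FROM transactions
WHERE wallet_id = 3 OR wallet_id = 4
ORDER BY id DESC
LIMIT 10;

RESET work_mem;
RESET random_page_cost;
~~~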
One sure way to optimize this is to do it yourself. As I said, `OR` queries are problematic. Try creating a supporting index for your query and we can try a couple of options.
~~~pgsql
create index ux1 on transactions (wallet_id, id desc)
~~~
This may come as a surprise to someone used to SQL Server, for example, but you can get a different plan out of Postgres by using `IN` (in SQL Server, `IN` gets translated to a bunch of `OR`s):
~~~pgsql
explain analyze select id from transactions where wallet_id in (3, 4) order by id desc limit 10
~~~
And you get a nice plan:
~~~
Limit  (cost=55.96..55.99 rows=10 width=4) (actual time=0.250..0.252 rows=10 loops=1)
  ->  Sort  (cost=55.96..58.46 rows=1000 width=4) (actual time=0.249..0.250 rows=10 loops=1)
        Sort Key: id DESC
        Sort Method: top-N heapsort  Memory: 25kB
        ->  Index Only Scan using ux1 on transactions  (cost=0.43..34.36 rows=1000 width=4) (actual time=0.022..0.144 rows=960 loops=1)
              Index Cond: (wallet_id = ANY ('{3,4}'::integer[]))
              Heap Fetches: 0
Planning Time: 0.092 ms
Execution Time: 0.268 ms
~~~
You can also try messing with the planner's head by telling it "these are not the IDs you're looking for" (or have an index on):
~~~pgsql
explain analyze select id from transactions where wallet_id = 3 or wallet_id = 4 order by 0+id desc limit 10
~~~
(note the `0+id`; `1*id` or `id+id` would also work, anything that makes the sort expression look different from what the existing index / PK is on):
~~~
Limit  (cost=3027.11..3027.13 rows=10 width=8) (actual time=0.332..0.334 rows=10 loops=1)
  ->  Sort  (cost=3027.11..3029.61 rows=1000 width=8) (actual time=0.331..0.332 rows=10 loops=1)
        Sort Key: ((1 * id)) DESC
        Sort Method: top-N heapsort  Memory: 25kB
        ->  Bitmap Heap Scan on transactions  (cost=16.86..3005.50 rows=1000 width=8) (actual time=0.047..0.171 rows=960 loops=1)
              Recheck Cond: ((wallet_id = 3) OR (wallet_id = 4))
              Heap Blocks: exact=7
              ->  BitmapOr  (cost=16.86..16.86 rows=1000 width=0) (actual time=0.037..0.038 rows=0 loops=1)
                    ->  Bitmap Index Scan on ux1  (cost=0.00..5.93 rows=200 width=0) (actual time=0.016..0.016 rows=192 loops=1)
                          Index Cond: (wallet_id = 3)
                    ->  Bitmap Index Scan on ux  (cost=0.00..10.43 rows=800 width=0) (actual time=0.021..0.021 rows=768 loops=1)
                          Index Cond: (wallet_id = 4)
Planning Time: 0.094 ms
Execution Time: 0.355 ms
~~~
Under some circumstances, you could also get away with just splitting the query into two and unioning them together:
~~~pgsql
explain analyze select id from (
  select id from transactions where wallet_id = 3
  union all
  select id from transactions where wallet_id = 4
) q order by id desc limit 10
~~~
~~~
Limit  (cost=70.96..70.99 rows=10 width=4) (actual time=0.539..0.542 rows=10 loops=1)
  ->  Sort  (cost=70.96..73.46 rows=1000 width=4) (actual time=0.538..0.540 rows=10 loops=1)
        Sort Key: transactions.id DESC
        Sort Method: top-N heapsort  Memory: 25kB
        ->  Append  (cost=0.43..49.35 rows=1000 width=4) (actual time=0.030..0.349 rows=960 loops=1)
              ->  Index Only Scan using ux1 on transactions  (cost=0.43..7.93 rows=200 width=4) (actual time=0.029..0.065 rows=192 loops=1)
                    Index Cond: (wallet_id = 3)
                    Heap Fetches: 0
              ->  Index Only Scan using ux1 on transactions transactions_1  (cost=0.43..26.43 rows=800 width=4) (actual time=0.012..0.179 rows=768 loops=1)
                    Index Cond: (wallet_id = 4)
                    Heap Fetches: 0
Planning Time: 0.176 ms
Execution Time: 0.569 ms
~~~
(this might not work very well for bigger wallets though)
You have plenty of options; `IN` should work best, but you might want to try them all to see which produces the best results.
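Since the plans above show the `IN` list being normalized to `wallet_id = ANY ('{3,4}'::integer[])`, the array form should behave the same way and is handy when the number of wallets varies at runtime (a sketch):
~~~pgsql
EXPLAIN ANALYZE
SELECT id
FROM transactions
WHERE wallet_id = ANY (ARRAY[3, 4])  -- or a single array parameter: = ANY ($1)
ORDER BY id DESC
LIMIT 10;
~~~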
# Answer 2

**Score**: 3
### Why?
The additional filter makes Postgres switch to a different query plan that turns out to be extremely inefficient. The effect is triggered more or less by chance. The underlying problem is that Postgres grossly under-estimates the selectivity of the `WHERE` clause. It expects that it can just walk the index on the PK (`transactions_pkey`) in the requested sort order and will find the few rows (`LIMIT 10`) soon enough. Turns out, the filter is extremely selective and Postgres has to skip over 118 million rows (!!!) before it finds enough matches (`Rows Removed by Filter: 117937409`). In cases where there are not enough rows to satisfy the limit, **all** rows have to be visited before Postgres can finally give up. The worst-case scenario.
The decision was made based on **wrong or misleading column statistics**. If you can improve column statistics, that might fix the problem by itself. *May* be as simple as `ANALYZE transactions;`. There are various ways. Here is a related answer from just yesterday on dba.SE, with more on that (and links to more similar cases):
- [Simple query with two WHERE clauses uses inferior index scan][1]
It just so happens that I discussed the very same "brute force" hack with `ORDER BY id + 0` that you found for your answer.
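One of those ways, sketched here as an illustration (the target value of 1000 is an assumption, not a recommendation from the linked answer), is to raise the per-column statistics target and re-analyze:
~~~pgsql
-- Sample wallet_id more thoroughly (the default statistics target is 100).
ALTER TABLE transactions ALTER COLUMN wallet_id SET STATISTICS 1000;
ANALYZE transactions;
~~~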
### Solution
For your particular case, there is a different approach, too. To get the best performance possible, create a multicolumn index (once):
~~~pgsql
CREATE INDEX transactions_wallet_id_id_idx ON transactions (wallet_id, id DESC);
~~~
It increases performance for your simple query a lot, too. (Even if that's pretty fast already.)
Then use this query:
~~~pgsql
(
SELECT id
FROM transactions
WHERE wallet_id = $1
ORDER BY id DESC
LIMIT 10
)
UNION ALL
(
SELECT id
FROM transactions
WHERE wallet_id = $2
ORDER BY id DESC
LIMIT 10
)
ORDER BY id DESC
LIMIT 10;
~~~
All parentheses required.
Now the second query is just about twice the cost of the first, i.e. ***very*** fast.
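If the list of wallets is not fixed at two, the same per-wallet top-10 idea generalizes with a `LATERAL` join; a sketch, assuming the multicolumn index above and integer wallet ids:
~~~pgsql
SELECT t.id
FROM unnest(ARRAY[$1, $2]::int[]) AS w(wallet_id)  -- any number of wallets
CROSS JOIN LATERAL (
    SELECT id
    FROM transactions
    WHERE transactions.wallet_id = w.wallet_id
    ORDER BY id DESC
    LIMIT 10                 -- top 10 per wallet, served by the index
) t
ORDER BY t.id DESC
LIMIT 10;
~~~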
[1]: https://dba.stackexchange.com/a/327316/3684
# Answer 3

**Score**: 0
After some more digging, I've finally solved the issue myself thanks to [this blog post][1]. The fix only applies if the `ORDER BY` column is an integer (or maybe just a number), and consists of adding `+ 0` to it. Posting it in case it helps someone.
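For reference, a sketch of the original query with the trick applied (the `+ 0` keeps the planner from matching the `ORDER BY` to the primary-key index):
~~~pgsql
SELECT "transactions"."id"
FROM "transactions"
WHERE ("wallet_id" = $1 OR "wallet_id" = $2)
ORDER BY "transactions"."id" + 0 DESC
LIMIT 10
~~~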
I also found several issues reported to PostgreSQL about problems like, or related to, this one, and I find the disregard some of the core developers show for real-world usage issues like these incredibly appalling. Apparently they consider it an acceptable design decision that a query which runs in 0.3s when you add `+ 0` to an `ORDER BY` on an integer field might otherwise take 20 minutes, due to some abort-early functionality in the planner gone wrong.
This has made me seriously reconsider our decision to migrate to Postgres, and we are already reviewing alternatives. Really sad.
[1]: https://ottertune.com/blog/how-to-fix-slow-postgresql-queries/