Slow PostgreSQL UPSERT with "Conflict Arbiter Indexes" on a non-indexed column

Question

The following query takes a long time to execute and I believe it is creating a bottleneck. It tries to UPSERT a row, where the update is on a column with no indexes but with check constraints:

EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS)
WITH data (cmp_id, action, delta)
AS (
    VALUES (42, 'action1', 1), (42, 'action2', 0.00081991800)
)
INSERT INTO counters (cmp_id, action, counter, limitval)
SELECT
    data.cmp_id::numeric,
    data.action,
    GREATEST(0, data.delta::numeric),
    0
FROM
    data
ORDER BY
    cmp_id, action
ON CONFLICT (cmp_id, action)
WHERE (cou_id IS NULL AND ref_id IS NULL AND pro_id IS NULL AND identifier IS NULL)
DO
UPDATE
SET
    counter = counters.counter + (
        SELECT
            data.delta::float
        FROM
            data
        WHERE
            data.cmp_id::bigint = excluded.cmp_id::bigint
            AND data.action::text = excluded.action::text
        LIMIT 1);

EXPLAIN ANALYZE gives the following output:

Insert on public.limit_counters  (cost=0.09..0.14 rows=2 width=120) (actual time=3566.376..3566.378 rows=0 loops=1)
  Conflict Resolution: UPDATE
  Conflict Arbiter Indexes: counters_unique_cmp_id
  Tuples Inserted: 0
  Conflicting Tuples: 2
  Buffers: shared hit=258
  CTE data
    ->  Values Scan on "*VALUES*"  (cost=0.00..0.03 rows=2 width=68) (actual time=0.001..0.002 rows=2 loops=1)
          Output: "*VALUES*".column1, "*VALUES*".column2, "*VALUES*".column3
  ->  Subquery Scan on "*SELECT*"  (cost=0.06..0.11 rows=2 width=120) (actual time=0.031..0.043 rows=2 loops=1)
        Output: nextval('counters_id_seq'::regclass), "*SELECT*".action, "*SELECT*".cmp_id, NULL::bigint, NULL::bigint, NULL::bigint, "*SELECT*"."greatest", "*SELECT*"."?column?", NULL::text, NULL::integer
        Buffers: shared hit=2
        ->  Sort  (cost=0.06..0.07 rows=2 width=100) (actual time=0.020..0.021 rows=2 loops=1)
              Output: ((data.cmp_id)::numeric), data.action, (GREATEST('0'::numeric, data.delta)), 0
              Sort Key: ((data.cmp_id)::numeric), data.action
              Sort Method: quicksort  Memory: 25kB
              ->  CTE Scan on data  (cost=0.00..0.05 rows=2 width=100) (actual time=0.006..0.007 rows=2 loops=1)
                    Output: (data.cmp_id)::numeric, data.action, GREATEST('0'::numeric, data.delta), 0
  SubPlan 2
    ->  Limit  (cost=0.00..0.06 rows=1 width=8) (actual time=0.006..0.006 rows=1 loops=2)
          Output: ((data_1.delta)::double precision)
          ->  CTE Scan on data data_1  (cost=0.00..0.06 rows=1 width=8) (actual time=0.005..0.005 rows=1 loops=2)
                Output: (data_1.delta)::double precision
                Filter: ((data_1.action = excluded.action) AND ((data_1.cmp_id)::bigint = excluded.cmp_id))
                Rows Removed by Filter: 0
Planning:
  Buffers: shared hit=22
Planning Time: 0.227 ms
Execution Time: 3566.434 ms

The table is very small (~250 rows) and the database is quite powerful.

The use case is multiple updates to the same rows, but not more than ~150 per second. Each update happens inside a transaction, but I try to perform it as close to the end of the transaction as possible, to minimize how long the lock is held.

There are multiple indexes on the table, but none on the updated column.
There are two check constraints on the updated counter column, which is of type float8:

    CONSTRAINT counters_overflow CHECK (((limitval = (0)::double precision) OR (counter <= (limitval + (0.01)::double precision)))),
    CONSTRAINT counters_underflow CHECK ((counter >= ('-0.01'::numeric)::double precision))

I imagine that the bottleneck comes from many concurrent updates locking the same rows. But I didn't expect it to be this severe, since the updates should in theory be HOT updates and I commit the transaction shortly afterwards.
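
To confirm lock contention, one diagnostic is to look for blocked sessions from another connection while the UPSERT is waiting. A sketch using pg_blocking_pids(), which is available since PostgreSQL 9.6:

-- Run from another session while the UPSERT hangs: lists sessions
-- that are currently blocked and the PIDs blocking them.
SELECT pid,
       pg_blocking_pids(pid) AS blocked_by,
       wait_event_type,
       state,
       query
FROM pg_stat_activity
WHERE cardinality(pg_blocking_pids(pid)) > 0;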

One thought I've had: could the check constraints, or the fact that the UPDATE sits inside an ON CONFLICT clause, be making the update slower?

One possible solution I have in mind is to change the logic to first try an UPDATE and, only if no row exists, fall back to an INSERT (sketched below). Or maybe change the fillfactor of the table? But I guess that only helps if the updates are not HOT.
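
A minimal sketch of that update-first logic, with hypothetical hard-coded values (cmp_id = 42, action = 'action1', delta = 0.5). Note it is not race-free: a concurrent insert between the two statements can still raise a unique violation, so in practice it needs a retry loop.

-- Try the UPDATE first; the common case, since the rows already exist.
UPDATE counters
SET counter = counter + 0.5
WHERE cmp_id = 42 AND action = 'action1';

-- Fall back to INSERT only when no matching row exists.
INSERT INTO counters (cmp_id, action, counter, limitval)
SELECT 42, 'action1', GREATEST(0, 0.5), 0
WHERE NOT EXISTS (
    SELECT 1 FROM counters WHERE cmp_id = 42 AND action = 'action1'
);

-- Lowering the fillfactor leaves free space on each page for HOT updates;
-- it only affects pages written after the change (e.g. after a rewrite).
ALTER TABLE counters SET (fillfactor = 70);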


To check for HOT updates I execute:

SELECT relname, n_tup_ins, n_tup_upd, n_tup_del, n_tup_hot_upd
FROM pg_stat_all_tables
WHERE relname = 'counters';

And get the following output:

relname    n_tup_ins   n_tup_upd   n_tup_del   n_tup_hot_upd
counters   129         39824183    126         39388687

So roughly 99% of the updates (39388687 of 39824183) have indeed been HOT.

Answer 1

Score: 1

All the time is spent on the actual data modification. The long duration suggests a lock. I'd set log_lock_waits = on; then you will get a log message whenever somebody has to wait for a lock for more than one second.
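
For reference, a minimal sketch of turning this on without a restart; the one-second threshold comes from deadlock_timeout, whose default is 1s:

-- Log any lock wait that exceeds deadlock_timeout (default 1s).
ALTER SYSTEM SET log_lock_waits = on;   -- written to postgresql.auto.conf
SELECT pg_reload_conf();                -- picked up on reload, no restart

-- Optionally lower the threshold to surface shorter waits:
-- ALTER SYSTEM SET deadlock_timeout = '200ms';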
