Misunderstanding about PostgreSQL temp files


Question

PostgreSQL 13.7

I have the pg_stat_statements extension in the postgres database on all of my PostgreSQL servers.

pmm-agent queries this extension once a minute:

SELECT /* pmm-agent:pgstatstatements */ pg_stat_statements.userid, pg_stat_statements.dbid, pg_stat_statements.queryid, pg_stat_statements.query, pg_stat_statements.calls, pg_stat_statements.total_exec_time, pg_stat_statements.rows, pg_stat_statements.shared_blks_hit, pg_stat_statements.shared_blks_read, pg_stat_statements.shared_blks_dirtied, pg_stat_statements.shared_blks_written, pg_stat_statements.local_blks_hit, pg_stat_statements.local_blks_read, pg_stat_statements.local_blks_dirtied, pg_stat_statements.local_blks_written, pg_stat_statements.temp_blks_read, pg_stat_statements.temp_blks_written, pg_stat_statements.blk_read_time, pg_stat_statements.blk_write_time FROM pg_stat_statements WHERE queryid IS NOT NULL AND query IS NOT NULL

On one of my servers, this query causes temporary file usage.

EXPLAIN (ANALYZE, BUFFERS, SETTINGS)
"Function Scan on pg_stat_statements  (cost=0.00..10.00 rows=990 width=168) (actual time=1068.291..1207.841 rows=3401 loops=1)"
"  Filter: ((queryid IS NOT NULL) AND (query IS NOT NULL))"
"  Buffers: temp read=30382 written=30382"
"Settings: effective_cache_size = '24GB', effective_io_concurrency = '200', max_parallel_workers = '12', max_parallel_workers_per_gather = '6', random_page_cost = '1.1', temp_buffers = '16MB', work_mem = '40MB'"
"Planning:"
"  Buffers: shared hit=6"
"Planning Time: 0.153 ms"
"Execution Time: 1244.587 ms"
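Temp file activity like this can also be confirmed at the database level; a minimal check against the standard pg_stat_database statistics view (filtering on the postgres database, where the extension lives, is an assumption):

```sql
-- Cumulative temporary-file statistics per database.
-- temp_files / temp_bytes grow every time a query spills to disk.
SELECT datname,
       temp_files,
       pg_size_pretty(temp_bytes) AS temp_bytes
FROM pg_stat_database
WHERE datname = 'postgres';
```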


On another server, the execution plan for this query is normal, for example:

"Function Scan on pg_stat_statements  (cost=0.00..10.00 rows=990 width=168) (actual time=0.939..1.064 rows=364 loops=1)"
"  Filter: ((queryid IS NOT NULL) AND (query IS NOT NULL))"
"Settings: cpu_index_tuple_cost = '0.0005', effective_cache_size = '16GB', effective_io_concurrency = '200', max_parallel_workers = '10', max_parallel_workers_per_gather = '6', parallel_tuple_cost = '0.05', random_page_cost = '1.1', temp_buffers = '16MB', work_mem = '153MB'"
"Planning:"
"  Buffers: shared hit=6"
"Planning Time: 0.140 ms"
"Execution Time: 2.224 ms"

There is no sort or join in the query, so why are temporary files used?

Increasing work_mem to 300MB did not solve the problem.
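One way to verify that the higher work_mem actually reaches the session running the query is to set it locally and re-check the plan (a sketch; the 300MB value is the one tried above, and the SELECT is a shortened form of the pmm-agent query):

```sql
-- Applies only to the current session
SET work_mem = '300MB';

-- The SETTINGS option makes the plan output report the work_mem
-- value the executor actually saw
EXPLAIN (ANALYZE, BUFFERS, SETTINGS)
SELECT queryid, query
FROM pg_stat_statements
WHERE queryid IS NOT NULL AND query IS NOT NULL;
```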


Answer 1

Score: 1

You can see from the execution plan that this view is defined with a function. PostgreSQL collects the function's results in a data structure called a "tuple store". As long as the tuple store fits into work_mem, it is kept in RAM; if it exceeds that size, it is spooled to a temporary file.

So either there is more data in the first case (pg_stat_statements.max has a higher value, or the statement texts are longer), or it is simply that work_mem is smaller there (the plans show 40MB on the first server versus 153MB on the second).
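Both quantities the answer points at can be checked directly on each server; a minimal sketch (pg_stat_statements.max and the pg_stat_statements view are standard parts of the extension):

```sql
-- Upper bound on the number of tracked statements (set at server start)
SHOW pg_stat_statements.max;

-- How much statement text the view currently holds: with many long
-- query texts, the tuple store can easily outgrow work_mem
SELECT count(*) AS statements,
       pg_size_pretty(sum(length(query))::bigint) AS total_query_text
FROM pg_stat_statements;
```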


huangapple
  • Published 2023-06-15 15:54:29
  • Please keep this link when reposting: https://go.coder-hub.com/76480269.html