How to compress data when using psql's copy statement?
Question
My goal is to efficiently import large amounts of data into a Postgres database. In principle, the raw data could be compressed by a factor of ~20 (e.g. using gzip).
The COPY statement seems to be the best option for a bulk import.
Apart from sslcompression (which is applied after the data is encrypted), is there a way to compress the actual data (content) transferred between client and server, or is that even built-in by default?
Many thanks.
(Should not matter, but I am using golang).
Answer 1
Score: 5
COPY [TO|FROM] PROGRAM allows using gzip as the program, if it is installed somewhere accessible to the PostgreSQL server process: https://www.postgresql.org/docs/current/sql-copy.html#id-1.9.3.55.10.
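For instance, a minimal sketch of this server-side variant (the table mytable, the path /tmp/data.csv.gz, and the connection string are placeholders; the compressed file must live on the database server, gzip must be on the server's PATH, and the role needs superuser or pg_execute_server_program rights):

```go
package main

import (
	"database/sql"
	"log"

	_ "github.com/lib/pq" // any Postgres driver works; the statement is plain SQL
)

func main() {
	// Connection string is a placeholder; adjust for your setup.
	db, err := sql.Open("postgres", "host=localhost dbname=mydb sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// The path is resolved on the *server*, and gzip runs as the server
	// process's OS user; that is why COPY ... FROM PROGRAM is restricted
	// to superusers or members of pg_execute_server_program.
	_, err = db.Exec(`COPY mytable FROM PROGRAM 'gzip -dc /tmp/data.csv.gz' WITH (FORMAT csv)`)
	if err != nil {
		log.Fatal(err)
	}
}
```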
You could also use COPY TO|FROM STDIN and do the compression or decompression client-side.
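A minimal sketch of the client-side variant, assuming pgx as the driver (the table name, file name, and connection string are placeholders):

```go
package main

import (
	"compress/gzip"
	"context"
	"log"
	"os"

	"github.com/jackc/pgx/v5"
)

func main() {
	ctx := context.Background()

	// Connection string is a placeholder; adjust for your setup.
	conn, err := pgx.Connect(ctx, "postgres://localhost/mydb")
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close(ctx)

	// Open the compressed file and decompress it on the client while streaming.
	f, err := os.Open("data.csv.gz")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	gz, err := gzip.NewReader(f)
	if err != nil {
		log.Fatal(err)
	}
	defer gz.Close()

	// Note: because decompression happens on the client, the bytes sent to
	// the server over the COPY protocol are the *uncompressed* CSV, so this
	// saves client disk and CPU, not client-to-server bandwidth.
	tag, err := conn.PgConn().CopyFrom(ctx, gz, `COPY mytable FROM STDIN WITH (FORMAT csv)`)
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("loaded %d rows", tag.RowsAffected())
}
```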
Answer 2
Score: 1
If your bottleneck is network throughput, you will want to send (copy or stream) compressed data to the database machine using something like scp or ssh, then run COPY in a client on the same machine the database server is running on. There are a number of ways this can be orchestrated, but all of them have something other than libpq/PostgreSQL as the conductor of the orchestra.
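A minimal sketch of one such orchestration, driven from Go with os/exec (dbhost, the file paths, and the database/table names are all placeholders, and it assumes the client has key-based ssh access to the database host):

```go
package main

import (
	"log"
	"os/exec"
)

func main() {
	// 1. Ship the compressed file to the database machine; only ~1/20th
	//    of the raw size crosses the network.
	if out, err := exec.Command("scp", "data.csv.gz", "dbhost:/tmp/data.csv.gz").CombinedOutput(); err != nil {
		log.Fatalf("scp failed: %v: %s", err, out)
	}

	// 2. Decompress and COPY on the database host itself, so the
	//    uncompressed stream never leaves that machine.
	load := `gzip -dc /tmp/data.csv.gz | psql -d mydb -c "COPY mytable FROM STDIN WITH (FORMAT csv)"`
	if out, err := exec.Command("ssh", "dbhost", load).CombinedOutput(); err != nil {
		log.Fatalf("remote COPY failed: %v: %s", err, out)
	}
}
```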