Logback - compressing log lines before writing to file
Question
I am not sure whether it is possible to gzip the stream of log lines as they arrive at the logback appender, rather than compressing the file when the log is rotated. Is that possible? If so, how can it be achieved, and is there much benefit to compressing "on the fly" rather than compressing the whole file afterwards?
Answer 1
Score: 1
Sure. You can simply keep a gzip compression stream open and feed it lines as they come in. That would significantly reduce the space required by the log file, and would not take any more CPU resources on average, since you were going to compress it eventually anyway.
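As a minimal sketch of keeping a gzip stream open and feeding it lines, the following uses only the JDK's `java.util.zip.GZIPOutputStream`. The class name `StreamingGzipLog` and its methods are hypothetical names chosen for illustration, and wiring this into an actual logback appender is not shown.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: one long-lived gzip stream that receives log lines.
public class StreamingGzipLog implements AutoCloseable {
    private final GZIPOutputStream gzip;

    public StreamingGzipLog(OutputStream sink) throws IOException {
        // syncFlush = true: flush() forces pending compressed data out to the sink.
        this.gzip = new GZIPOutputStream(sink, true);
    }

    // Compress one log line; the bytes may stay buffered until flush() or close().
    public void writeLine(String line) throws IOException {
        gzip.write((line + "\n").getBytes(StandardCharsets.UTF_8));
    }

    // Push everything written so far to the sink (costs a little compression ratio).
    public void flush() throws IOException {
        gzip.flush();
    }

    // Finish the deflate stream and write the gzip trailer (CRC-32 and length).
    @Override
    public void close() throws IOException {
        gzip.close();
    }

    // Helper for checking the result: decompress a complete gzip byte array.
    public static String decompress(byte[] data) throws IOException {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(data))) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }
}
```

In practice the sink would likely be a `FileOutputStream` opened on the log file. Calling `flush()` after every line keeps the file closer to up to date, at the cost of a worse compression ratio.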
The downside is that at any point in time the compressed log file will not yet contain many of the already supplied log lines, since there is a latency and burstiness to the compression process. Many lines will need to be accumulated before a compressed block is emitted. Second, the compressed file will not be a valid gzip file until it is closed. You would still be able to decompress what's there, but it will not have the trailer with a check value. If the process is killed or the machine crashes, you are left with an invalid gzip file that doesn't have the most recent several log lines. Of course, the most recent log lines may be exactly the ones that you're most interested in, to find out what the heck happened.
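The latency and the missing trailer can be observed with a small experiment. This is a sketch using only JDK classes; `TruncatedGzipDemo`, `Recovered`, and `readAvailable` are hypothetical names. It reads whatever can be decompressed from a gzip byte array and reports whether the stream ended before the trailer was reached.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class TruncatedGzipDemo {
    // What could be recovered, and whether the stream ended before the trailer.
    public static final class Recovered {
        public final String text;
        public final boolean truncated;

        Recovered(String text, boolean truncated) {
            this.text = text;
            this.truncated = truncated;
        }
    }

    // Decompress as much as possible; an EOFException marks an unfinished stream.
    public static Recovered readAvailable(byte[] data) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        boolean truncated = false;
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(data))) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        } catch (EOFException e) {
            truncated = true; // input ran out before the gzip trailer
        }
        return new Recovered(new String(out.toByteArray(), StandardCharsets.UTF_8), truncated);
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        GZIPOutputStream gzip = new GZIPOutputStream(sink, true);
        gzip.write("something went wrong\n".getBytes(StandardCharsets.UTF_8));

        // Snapshot taken as if the process had been killed right now.
        Recovered beforeFlush = readAvailable(sink.toByteArray());
        System.out.println("before flush: recovered " + beforeFlush.text.length()
                + " chars, truncated=" + beforeFlush.truncated);

        gzip.flush(); // sync flush: the line is now recoverable, but there is no trailer yet
        Recovered afterFlush = readAvailable(sink.toByteArray());
        System.out.println("after flush: recovered " + afterFlush.text.length()
                + " chars, truncated=" + afterFlush.truncated);

        gzip.close(); // writes the trailer, making the stream a valid gzip file
        Recovered afterClose = readAvailable(sink.toByteArray());
        System.out.println("after close: truncated=" + afterClose.truncated);
    }
}
```

The snapshot taken before the flush is expected not to contain the line yet, because the compressor is still holding it internally. After the flush the line can be recovered, but the reader still hits an unexpected end of stream until `close()` writes the trailer.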
All of those downsides can be cured with an approach specialized for this application, which is implemented in gzlog.h/gzlog.c (in the examples directory of the zlib source distribution). gzlog ensures that after each line is written, the gzipped log file is complete and valid, and contains that log line. Furthermore, it can reconstruct the gzip file with the last provided log line even if the gzlog process itself is interrupted in the middle of adding a log line.