

Making a journal file in golang







I have a small project in Go that receives text lines over TCP to process. However, to ensure robustness, I want to create some sort of journal so that nothing is lost in case of power failure (e.g. a frame of data has been received by my app but not yet processed).

I have googled for any guides on how a journal file should be implemented, but the search results are heavily polluted by Oracle RDBMS documentation and such.

My thought was something like: immediately after receiving a line, write it to a file with a "not processed" flag. After processing, update the file so that this flag is cleared, freeing the record for overwriting. At the same time as this flag is cleared, send a "processed ack" to the data sender. Perhaps it's easiest to deal with fixed-size "slots" in the journal and maintain a "free list" of unused slots, so that I can reuse freed slots rather than having an ever-increasing file.

Is there any "best practice" for implementing such files in custom code, e.g. with regard to file structure, padding and locking? Are there any concerns with doing so in Go, given that it is cross-platform, rather than using native file-system APIs?


Score: 5





You shouldn't rewrite a journal. Just append the operations to it so that you can recreate them, and then control the strictness level you want.

The logic should simply be:

  1. receive message

  2. write it to journal

  3. optionally do an fsync on the journal now - depending on your consistency requirements.

  4. optionally then send a "received ack" - depends on your needs.

  5. process the message.

  6. optionally write another "processed" record to the file with the id of the original record. You don't always need this, but it is how you avoid rewriting the old record. Alternatively, you can write a separate file holding the "top transaction id" you've processed, so you'll automatically know where to resume processing after a failure. This will also reduce the journal size.

  7. send a "processed ack" or "processing failure" - again, depends on what you want.

Databases usually let you control the fsync behavior - on every write, every N seconds, or whenever the OS decides - it's a trade-off between speed and durability.

A good read on the subject might be this post on redis persistence:

[EDIT] another great read on the subject - http://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying

As for the Go aspect of it - there are a few options for writing to files, from a low-level file handle to a buffered writer. Of course, a file handle will keep you most in control of what's going on under the hood. I'm not sure how much caching a normal file writer in Go does behind the scenes; I'd suggest reading the code if you intend to use it.

  • Posted on May 8, 2014, 14:48:28
  • When reposting, please keep the link to this article: https://go.coder-hub.com/23534691.html



