How do logging frameworks like Log4j guarantee log statement ordering?
Question
This question has been bugging me for a while: how do popular logging frameworks like Log4j, which allow concurrent, asynchronous logging, guarantee log order without creating a performance bottleneck? That is, if log statement L1 was invoked before log statement L2, L1 is guaranteed to appear in the log file before L2.
I know Log4j2 uses a ring buffer and sequence numbers, but it still isn't intuitive how this solves the problem.
Could anyone give an intuitive explanation or point me to a resource doing the same?
Answer 1
Score: 3
This all depends on what you mean by "logging order". Within a single thread, logging order is preserved because each logging call results in a write.
When logging asynchronously, each log event is added to a queue in the order it was received and is processed in first-in, first-out (FIFO) order, regardless of how it got there. This isn't very challenging because the writer is single-threaded.
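To make the queue-plus-single-writer idea concrete, here is a minimal sketch. It is not Log4j2 code: the class `AsyncLogSketch` and its methods are hypothetical names, a plain `LinkedBlockingQueue` stands in for the ring buffer mentioned in the question, and an in-memory list stands in for the log file.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch, not Log4j2 code: callers enqueue events and one
// writer thread drains the queue, so output order equals enqueue order.
class AsyncLogSketch {
    // Unique sentinel instance, compared by identity so no real message matches it.
    private static final String STOP = new String("STOP");

    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final List<String> written = new ArrayList<>(); // stands in for the log file
    private final Thread writer;

    AsyncLogSketch() {
        writer = new Thread(() -> {
            try {
                while (true) {
                    String event = queue.take(); // blocks until an event is available
                    if (event == STOP) {
                        return;
                    }
                    written.add(event); // only this thread ever "writes"
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        writer.start();
    }

    // Called by application threads; returns as soon as the event is queued.
    void log(String message) {
        queue.add(message);
    }

    // Stops the writer and returns everything written, in write order.
    List<String> close() {
        queue.add(STOP);
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return written;
    }
}
```

With a single caller, `log("a")` followed by `log("b")` always ends up written as `a` then `b`, because the queue is FIFO and there is only one consumer.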
However, if you are talking about logging order across threads, that is never guaranteed, even when logging synchronously, because it can't be. Thread 1 could start to log before Thread 2, but Thread 2 could reach the synchronization point in the write ahead of Thread 1. Likewise, the same could occur when adding events to the queue. Locking the logging call in the logging method would preserve order, but for little to no benefit and with disastrous performance consequences.
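The difference between per-thread and cross-thread ordering can be sketched as follows. This is an illustrative example with hypothetical names (`CrossThreadOrderSketch`, `run`, `perThreadOrderHolds`); a synchronized list stands in for a shared appender. Each thread's own events always appear in sequence, while the interleaving between threads depends on scheduling and may differ from run to run.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: several threads append to one shared, synchronized sink.
class CrossThreadOrderSketch {
    // Each entry is {threadId, sequenceNumber}.
    static List<int[]> run(int threads, int eventsPerThread) {
        List<int[]> sink = Collections.synchronizedList(new ArrayList<>());
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final int id = t;
            Thread worker = new Thread(() -> {
                for (int seq = 0; seq < eventsPerThread; seq++) {
                    // Which thread wins the lock next is up to the scheduler.
                    sink.add(new int[] {id, seq});
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return sink;
    }

    // True if every thread's events appear as 0, 1, 2, ... within the sink.
    static boolean perThreadOrderHolds(List<int[]> sink, int threads) {
        int[] next = new int[threads];
        for (int[] entry : sink) {
            if (entry[1] != next[entry[0]]) {
                return false;
            }
            next[entry[0]]++;
        }
        return true;
    }
}
```

`perThreadOrderHolds` is the only ordering property this sketch can check reliably; nothing about the relative position of events from different threads is promised.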
In a multi-threaded environment, it is entirely possible to see logging events whose timestamps are out of order: Thread 1 resolves its timestamp and is then interrupted by Thread 2, which resolves its own timestamp and logs its event first. However, if you write your logs to something like Elasticsearch, you would probably never notice, since events there are typically queried sorted by timestamp.
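The timestamp scenario can be illustrated with a small sketch. The names (`TimestampOrderSketch`, `Event`, `sortedMessages`) and the timestamp values are made up for illustration: the list models the order in which events reached the log, and sorting by the captured timestamp is roughly what a timestamp-ordered view of a log store does.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: events arrive out of timestamp order and are
// re-ordered by timestamp when read back.
class TimestampOrderSketch {
    static final class Event {
        final long timestampNanos; // captured before the event reached the log
        final String message;

        Event(long timestampNanos, String message) {
            this.timestampNanos = timestampNanos;
            this.message = message;
        }
    }

    // Returns the messages sorted by timestamp; the input list is left untouched.
    static List<String> sortedMessages(List<Event> arrivalOrder) {
        List<Event> copy = new ArrayList<>(arrivalOrder);
        // List.sort is stable, so events with equal timestamps keep arrival order.
        copy.sort(Comparator.comparingLong(e -> e.timestampNanos));
        List<String> messages = new ArrayList<>();
        for (Event event : copy) {
            messages.add(event.message);
        }
        return messages;
    }
}
```

For example, if Thread 2's event (timestamp 200) is written before Thread 1's event (timestamp 100), the file shows Thread 2 first, while the sorted view shows Thread 1 first.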