Why would Kafka's disk usage cyclically drop every 7 days?

Question


Such a pattern isn't seen on another cluster, which also has the (default) retention set to 7 days.

The cluster in question is v2.8.1 and has these configurations set (everything else is left to defaults):

auto.create.topics.enable = false
delete.topic.enable = true
num.partitions = 10
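For reference, since nothing else is overridden above, the broker defaults that drive the 7-day window would apply. A sketch of the relevant defaults (as documented for Kafka 2.8.x; worth verifying against your broker's actual config):

```properties
# Broker defaults relevant to the 7-day pattern (Kafka 2.8.x):
log.cleanup.policy=delete                # delete old segments rather than compact them
log.retention.hours=168                  # 7 days of time-based retention
log.retention.check.interval.ms=300000   # log cleaner checks for deletable segments every 5 minutes
```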

Below is a metric graph from CloudWatch, as an example of the 7-day pattern mentioned in the title.

[CloudWatch graph: KafkaDataLogsDiskUsed climbing steadily, then dropping sharply every 7 days]

The metric KafkaDataLogsDiskUsed graphed above is defined as:
> The percentage of disk space used for data logs.

Does "data logs" include Kafka's own/internal logs? One explanation of the graph above would be that some internal/bookkeeping process fills in "some" logs every 7 days, all at once (and those logs then get discarded, also almost simultaneously, after another 7 days, or whatever retention.ms is set to). This pattern would be more visible on this toy cluster, with not much "real" data flowing in.

Answer 1

Score: 3


As you mentioned, the default retention for all Kafka topics is 7 days, and, more importantly, the default log.cleanup.policy is set to delete (as opposed to compact). Since the logs for all of your Kafka topics are stored on disk, the trend you are seeing is not surprising.

Once the default retention threshold (log.retention.ms) has been exceeded, a cleanup process eventually runs and deletes all segments older than that threshold; since those segments are stored on disk, the deletion shows up as the sharp drop in your graph.
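The cleanup pass described above can be sketched in a few lines. This is a hypothetical simulation, not Kafka's actual cleaner code: it models closed log segments as (last-modified timestamp, size) pairs and deletes everything older than the retention window in one pass, which is what produces the "fill for 7 days, drop at once" sawtooth in the disk-usage graph.

```python
from datetime import datetime, timedelta

# Hypothetical sketch of time-based retention with log.cleanup.policy=delete.
RETENTION = timedelta(days=7)  # mirrors the default log.retention.hours = 168

def clean_segments(segments, now):
    """Return the segments that survive a cleanup pass at time `now`.

    `segments` is a list of (last_modified, size_bytes) tuples for closed
    log segments; anything older than the retention window is deleted.
    """
    return [(ts, size) for ts, size in segments if now - ts <= RETENTION]

now = datetime(2023, 5, 25)
segments = [
    (now - timedelta(days=8), 1024),  # older than retention -> deleted
    (now - timedelta(days=6), 2048),  # within retention -> kept
    (now - timedelta(days=1), 512),   # within retention -> kept
]
kept = clean_segments(segments, now)
print(len(kept), sum(size for _, size in kept))  # -> 2 2560
```

Because the cleaner only considers *closed* segments, a low-traffic cluster where a segment fills (or rolls) only occasionally will see disk usage build up and then fall in large discrete steps, exactly the pattern in the graph.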

If you need more detail, this blog post does an excellent job going over the retention and clean-up process.

huangapple
  • Posted on 2023-05-25 22:15:29
  • Please keep this link when reposting: https://go.coder-hub.com/76333287.html