Cassandra ReadRepairStage error in system.log

Question

We are currently seeing the following error in the Cassandra system.log:

ERROR [ReadRepairStage:896014] CassandraDaemon.java:228 - Exception in thread Thread[ReadRepairStage:896014,5,main]
org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - received only 0 responses.

We have set read_request_timeout_in_ms to 3 seconds. Does this affect the background data synchronization that is used to maintain the replication factor?

I tried increasing read_request_timeout_in_ms on one node in the cluster, but it has not made much difference.

Answer 1

Score: 1

There are many reasons for this type of issue:

  1. Network issues
  2. Dropped mutations
    a. Overloaded nodes
    b. Resource (CPU, memory, disk I/O) performance issues
  3. Long GC pauses
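
To see how often these timeouts occur before digging into the causes above, a minimal Python sketch like the following could scan system.log. The regular expressions are an assumption based only on the two log lines quoted in the question, and the function name is hypothetical.

```python
import re

# Assumed patterns, derived only from the two log lines quoted in the question:
#   ERROR [ReadRepairStage:896014] CassandraDaemon.java:228 - Exception in thread ...
#   org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - received only 0 responses.
STAGE_ERROR = re.compile(r"ERROR \[ReadRepairStage:\d+\]")
TIMEOUT = re.compile(
    r"ReadTimeoutException: Operation timed out - received only (\d+) responses"
)


def summarize_read_repair_timeouts(lines):
    """Count ReadRepairStage errors and tally the reported response counts.

    Returns (number of ReadRepairStage error lines,
             {responses received: number of timeouts}).
    """
    stage_errors = 0
    responses = {}
    in_stage_error = False  # True only for the line right after a stage error
    for line in lines:
        if STAGE_ERROR.search(line):
            stage_errors += 1
            in_stage_error = True
            continue
        match = TIMEOUT.search(line)
        if match and in_stage_error:
            received = int(match.group(1))
            responses[received] = responses.get(received, 0) + 1
        in_stage_error = False
    return stage_errors, responses
```

The function accepts any iterable of lines, so it can be called with an open file handle for system.log. Comparing the counts across nodes and time windows may help tie the errors to network problems, overload, or GC pauses.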

You'll have to diagnose why the read repair is happening. Typically it is triggered by dropped mutations, whatever their cause, or by the read repair chance settings. If it is read repair chance, be sure to set both dc_local_read_repair_chance and read_repair_chance to 0. Neither one is needed, and both will default to 0 in later builds.
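
As an illustration of the last point, the sketch below builds the CQL statement that sets both options to 0 on a single table. It assumes a Cassandra version in which these table options still exist. The helper name and the keyspace and table names (my_keyspace, my_table) are placeholders, not taken from the question.

```python
def disable_read_repair_chance_cql(keyspace, table):
    """Build a CQL ALTER TABLE statement that sets both chance options to 0.

    Assumes a Cassandra version that still supports the read_repair_chance
    and dc_local_read_repair_chance table options.
    """
    return (
        f"ALTER TABLE {keyspace}.{table} "
        "WITH read_repair_chance = 0 "
        "AND dc_local_read_repair_chance = 0;"
    )


# Placeholder keyspace and table names; substitute your own.
print(disable_read_repair_chance_cql("my_keyspace", "my_table"))
```

The resulting statement can be run in cqlsh. Because these are per-table options, it would need to be repeated for each affected table.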

huangapple
  • Posted on July 10, 2023, 19:36:00
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76653337.html