ReadRepairStage error in Cassandra system.log
Question
Currently we are seeing the following in the Cassandra system.log:
ERROR [ReadRepairStage:896014] CassandraDaemon.java:228 - Exception in thread Thread[ReadRepairStage:896014,5,main]
org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - received only 0 responses.
We have set read_request_timeout_in_ms to 3 seconds. Does this affect the background data synchronization that is used to maintain the replication factor?
I tried increasing read_request_timeout_in_ms on one node in the cluster, but there hasn't been much improvement.
Answer 1
Score: 1
There are many possible causes for this type of issue:
- Network issues
- Dropped mutations
  a. Overloaded nodes
  b. I/O (CPU, memory, disk) performance issues, long GC pauses
You'll have to diagnose why the read repair is happening. Typically, it is caused by mutations being dropped for whatever reason, or by the read repair chance settings. If it is read repair chance, be sure to set dclocal_read_repair_chance and read_repair_chance to 0. Neither one is needed, and both will be set to 0 by default in later builds.
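As a rough illustration of the last suggestion, the sketch below builds the CQL statement that sets both table options to 0. It assumes a Cassandra version that still exposes these table options, and it uses the CQL spelling dclocal_read_repair_chance. The helper name and the keyspace and table names are hypothetical placeholders, not anything from the original post.

```python
def disable_read_repair_chance_cql(keyspace: str, table: str) -> str:
    """Build an ALTER TABLE statement that sets both read repair chance options to 0.

    Assumption: the target cluster still supports these two table options.
    """
    return (
        f"ALTER TABLE {keyspace}.{table} "
        "WITH read_repair_chance = 0 "
        "AND dclocal_read_repair_chance = 0;"
    )


if __name__ == "__main__":
    # 'my_keyspace' and 'my_table' are placeholder names; substitute your own.
    print(disable_read_repair_chance_cql("my_keyspace", "my_table"))
```

The printed statement can be run in cqlsh once per affected table. Because these are table-level options, changing them does not require editing cassandra.yaml on each node.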