Leak detected issue in Cassandra 3.11.1
Question
<details>
<summary>English:</summary>
We are running a 36-node Cassandra cluster. Suddenly, we started seeing the errors below in the logs of 2 Cassandra nodes. We are also seeing fluctuations in the available-storage graphs for these nodes. I am unable to work out the root cause. Any help is appreciated.
```
ERROR [CompactionExecutor:164360] 2023-05-04 00:30:58,094 CassandraDaemon.java:228 - Exception in thread Thread[CompactionExecutor:164360,1,main]
java.lang.IllegalArgumentException: null
    at java.nio.Buffer.position(Buffer.java:244) ~[na:1.8.0_131]
    at org.apache.cassandra.io.util.SafeMemoryWriter.reallocate(SafeMemoryWriter.java:59) ~[apache-cassandra-3.11.1.jar:3.11.1]
    at org.apache.cassandra.io.util.SafeMemoryWriter.setCapacity(SafeMemoryWriter.java:68) ~[apache-cassandra-3.11.1.jar:3.11.1]
    at org.apache.cassandra.io.sstable.IndexSummaryBuilder.prepareToCommit(IndexSummaryBuilder.java:250) ~[apache-cassandra-3.11.1.jar:3.11.1]
    at org.apache.cassandra.io.sstable.format.big.BigTableWriter$IndexWriter.doPrepare(BigTableWriter.java:524) ~[apache-cassandra-3.11.1.jar:3.11.1]
    at org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.prepareToCommit(Transactional.java:173) ~[apache-cassandra-3.11.1.jar:3.11.1]
    at org.apache.cassandra.io.sstable.format.big.BigTableWriter$TransactionalProxy.doPrepare(BigTableWriter.java:364) ~[apache-cassandra-3.11.1.jar:3.11.1]
    at org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.prepareToCommit(Transactional.java:173) ~[apache-cassandra-3.11.1.jar:3.11.1]
    at org.apache.cassandra.io.sstable.format.SSTableWriter.prepareToCommit(SSTableWriter.java:281) ~[apache-cassandra-3.11.1.jar:3.11.1]
    at org.apache.cassandra.io.sstable.SSTableRewriter.doPrepare(SSTableRewriter.java:379) ~[apache-cassandra-3.11.1.jar:3.11.1]
    at org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.prepareToCommit(Transactional.java:173) ~[apache-cassandra-3.11.1.jar:3.11.1]
    at org.apache.cassandra.db.compaction.writers.CompactionAwareWriter.doPrepare(CompactionAwareWriter.java:111) ~[apache-cassandra-3.11.1.jar:3.11.1]
    at org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.prepareToCommit(Transactional.java:173) ~[apache-cassandra-3.11.1.jar:3.11.1]
    at org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.finish(Transactional.java:184) ~[apache-cassandra-3.11.1.jar:3.11.1]
    at org.apache.cassandra.db.compaction.writers.CompactionAwareWriter.finish(CompactionAwareWriter.java:121) ~[apache-cassandra-3.11.1.jar:3.11.1]
    at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:220) ~[apache-cassandra-3.11.1.jar:3.11.1]
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[apache-cassandra-3.11.1.jar:3.11.1]
    at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:85) ~[apache-cassandra-3.11.1.jar:3.11.1]
    at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:61) ~[apache-cassandra-3.11.1.jar:3.11.1]
    at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:268) ~[apache-cassandra-3.11.1.jar:3.11.1]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_131]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_131]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_131]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_131]
    at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81) [apache-cassandra-3.11.1.jar:3.11.1]
    at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_131]
WARN [CompactionExecutor:164909] 2023-05-04 00:31:54,573 IndexSummaryBuilder.java:115 - min_index_interval of 128 is too low for 4298508892 expected keys of avg size 64; using interval of 145 instead
ERROR [Reference-Reaper:1] 2023-05-04 00:32:02,326 Ref.java:224 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@4f374614) to class org.apache.cassandra.io.util.SafeMemory$MemoryTidy@1106289026:Memory@[7fa6ab9ec010..7fa72ad798d0) was not released before the reference was garbage collected
```
</details>
# Answer 1
**Score**: 1
<details>
<summary>English:</summary>
There are a few things going on here, and the memory warning is probably the one that concerns me the least.
WARN [CompactionExecutor:164909] 2023-05-04 00:31:54,573 IndexSummaryBuilder.java:115
- min_index_interval of 128 is too low for 4298508892 expected keys of avg size 64;
using interval of 145 instead
This warning about the index summary actually indicates a serious issue, although it is not commonly seen because of the volume of data needed to trigger it. In some scenarios the warning is not logged at all, even though the index summary is already broken.
When the index summary for a table breaks, read response times suffer, and reads can be forced to scan SSTables because the summary no longer works. This is relatively easy to fix:
ALTER TABLE your_ks.your_table WITH min_index_interval = 256;
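As a rough check on the numbers in the warning, the sketch below reproduces the adjusted interval of 145 and shows why 256 avoids the adjustment. It assumes the summary is capped at 2^31 - 1 bytes and that each entry costs the average key size plus 8 bytes. Both assumptions are inferred from the logged values, not quoted from the Cassandra source. `expected_summary_bytes` and `effective_min_interval` are hypothetical helper names.

```python
INT_MAX = 2**31 - 1      # assumed cap on the summary size, in bytes
ENTRY_OVERHEAD = 8       # assumed per-entry overhead on top of the key, in bytes


def expected_summary_bytes(expected_keys: int, avg_key_size: int, interval: int) -> int:
    """Estimated summary size when one key in every `interval` keys is sampled."""
    return (expected_keys // interval) * (avg_key_size + ENTRY_OVERHEAD)


def effective_min_interval(expected_keys: int, avg_key_size: int, min_index_interval: int) -> int:
    """Return the configured interval if the summary fits, else the smallest one that does."""
    if expected_summary_bytes(expected_keys, avg_key_size, min_index_interval) <= INT_MAX:
        return min_index_interval
    total_bytes = expected_keys * (avg_key_size + ENTRY_OVERHEAD)
    # Integer ceiling division, to avoid floating-point rounding on large values.
    return (total_bytes + INT_MAX - 1) // INT_MAX


if __name__ == "__main__":
    keys, key_size = 4_298_508_892, 64  # values taken from the WARN line
    print(effective_min_interval(keys, key_size, 128))  # 145, matching the log
    print(effective_min_interval(keys, key_size, 256))  # 256, no adjustment needed
```

Under these assumptions, an interval of 128 implies a summary of roughly 2.4 GB, which is over the cap, while 256 brings it down to roughly 1.2 GB.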
Identifying which table is affected is a bit harder, since the warning does not name it, but a partition key count of over 4.2 billion keys should make it relatively easy to spot.
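One way to narrow this down is to scan the output of `nodetool tablestats` from an affected node for tables with very large key estimates. The sketch below is a minimal parser, assuming per-table lines labelled `Table:` and `Number of keys (estimate):` (or `Number of partitions (estimate):`); the exact labels vary between Cassandra versions. The sample text, the threshold, and the function name `large_tables` are made up for illustration, and the per-table estimate need not match the logged figure exactly.

```python
import re
from typing import List, Tuple

KEYSPACE_RE = re.compile(r"^\s*Keyspace\s*:\s*(\S+)")
TABLE_RE = re.compile(r"^\s*Table(?: \(index\))?:\s*(\S+)")
KEYS_RE = re.compile(r"^\s*Number of (?:keys|partitions) \(estimate\):\s*(\d+)")


def large_tables(tablestats_text: str, threshold: int) -> List[Tuple[str, int]]:
    """Return (keyspace.table, estimated keys) pairs at or above `threshold`, largest first."""
    keyspace = table = None
    hits = []
    for line in tablestats_text.splitlines():
        ks_match = KEYSPACE_RE.match(line)
        tbl_match = TABLE_RE.match(line)
        keys_match = KEYS_RE.match(line)
        if ks_match:
            keyspace, table = ks_match.group(1), None
        elif tbl_match:
            table = tbl_match.group(1)
        elif keys_match and table is not None:
            count = int(keys_match.group(1))
            if count >= threshold:
                hits.append((f"{keyspace}.{table}", count))
    return sorted(hits, key=lambda item: item[1], reverse=True)


# Illustrative sample only; real tablestats output has many more fields per table.
SAMPLE = """\
Keyspace : your_ks
    Read Count: 0
        Table: your_table
        SSTable count: 12
        Number of keys (estimate): 4298508892
        Table: small_table
        SSTable count: 1
        Number of keys (estimate): 1024
"""

if __name__ == "__main__":
    print(large_tables(SAMPLE, threshold=1_000_000_000))
```

In practice the text would come from running `nodetool tablestats` on one of the two affected nodes and saving the output to a file.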
Because of the old release you are on and the broken summaries, it is difficult to ascertain cause and effect for the compaction issue itself. I would recommend fixing the index summaries first, then the patch level, and then checking whether the issue recurs.
You also need to start adding a number of tasks to your operational backlog:
* 3.11.1 is significantly out of date; please upgrade to 3.11.15.
* Java 1.8.0_131 is very old; please update it to a newer release.
* Start planning an upgrade to Cassandra 4; 3.11 will no longer be community supported once Cassandra 5 is released. (https://cassandra.apache.org/_/blog/Apache-Cassandra-3.0.x-and-3.11.x-End-of-Life-Announcement.html)
</details>
# Answer 2
**Score**: 0
<details>
<summary>English:</summary>
This is a new issue for me, but you can try tuning the garbage collector.
I have gathered some useful links on this process:
***Configuration:***
https://docs.datastax.com/en/dse/6.8/dse-admin/datastax_enterprise/operations/opsTuningGcAbout.html
***Tuning:***
https://docs.datastax.com/en/cassandra-oss/3.0/cassandra/operations/opsTuneJVM.html
https://thelastpickle.com/blog/2018/04/11/gc-tuning.html
***Logs:***
https://cassandra.apache.org/doc/3.11/cassandra/troubleshooting/reading_logs.html
***Monitoring:***
https://medium.com/@mlowicki/monitoring-cassandra-garbage-collector-83c8a515e403
</details>