Cassandra consistency issue when a node dies or goes offline in the middle of a transaction

Question

I have the following scenario.

Assume we have 6 nodes in a cluster with a replication factor of 3.
We are writing with QUORUM. Say the coordinator sees that the 3 replica nodes for the data (n1, n2, n3) are up, but just as it sends the write to them, n1 and n2 stop working. The write operation fails because quorum is not met, but n3 still holds the data from this failed upsert, because Cassandra has no rollback.

After this, n1 and n2 come back up, but with the older data.

If a read is now done, will read repair replicate the latest data (from the failed upsert) on n3 to n1 and n2? Is my understanding correct?

Answer 1

Score: 2

Yes, you are correct.

If a read repair runs for that record, the data will be replicated from n3 to n1 and n2. Whether that happens depends on your read_repair_chance configuration and on how often the record is queried.

If the record isn't queried often, a read repair is unlikely to run, and you will have to wait for your next scheduled repair.
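The read repair behaviour described above can be illustrated with a toy model. This is only a sketch, not Cassandra code: the `Cell` class, the `read_with_repair` function, and the integer timestamps are all assumptions made for illustration. It shows the "highest write timestamp wins" rule and that only the replicas contacted by the read get repaired.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    value: str
    timestamp: int  # write timestamp; the highest one wins on read


def read_with_repair(replicas, contacted):
    """Return the newest value among the contacted replicas and repair stale ones.

    Hypothetical helper for illustration only.
    """
    newest = max((replicas[name] for name in contacted), key=lambda c: c.timestamp)
    for name in contacted:
        if replicas[name].timestamp < newest.timestamp:
            # Overwrite the stale copy with the winning cell.
            replicas[name] = Cell(newest.value, newest.timestamp)
    return newest.value


# State after the failed write: only n3 applied the upsert.
replicas = {
    "n1": Cell("old", 1),
    "n2": Cell("old", 1),
    "n3": Cell("new", 2),
}

# A read that happens to contact n1 and n3 repairs n1 but leaves n2 untouched.
result = read_with_repair(replicas, ["n1", "n3"])
```

In this model n2 stays stale until another read touches it together with an up-to-date replica, which is why a regular full repair is still needed.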

If you don't run nodetool repair on a regular schedule, you should start doing so!

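Note that if you read with QUORUM consistency before a repair has occurred, you may still get the old value: a QUORUM read only needs 2 of the 3 replicas, so if n1 and n2 are the ones that answer, n3's newer data is never seen.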
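To see which value a QUORUM read would return before any repair, one can enumerate every pair of replicas that could answer. The sketch below is a toy model under stated assumptions: the `timestamps` and `values` dictionaries and the `quorum_read` function are hypothetical, and the read simply returns the value with the highest timestamp among the contacted replicas.

```python
from itertools import combinations

# Assumed state after the failed write: only n3 has the newer timestamp.
timestamps = {"n1": 1, "n2": 1, "n3": 2}
values = {"n1": "old", "n2": "old", "n3": "new"}


def quorum_read(contacted):
    """Return the value held by the contacted replica with the newest timestamp."""
    winner = max(contacted, key=lambda name: timestamps[name])
    return values[winner]


# With rf=3, QUORUM needs 2 replicas, so there are three possible pairs.
outcomes = {pair: quorum_read(pair) for pair in combinations(sorted(values), 2)}

for pair, value in outcomes.items():
    print(pair, "->", value)
```

Only the pair that excludes n3 returns the old value in this model; any pair that includes n3 returns the data from the failed upsert.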

Answer 2

Score: 2

There are two different kinds of errors. If n1 and n2 are already DOWN, the write won't go to n3 at all; you will get an Unavailable exception and there is no issue.

If n1 and n2 go down AFTER the write has started, or suffer some catastrophic loss, the data will still exist on n3 and you will get a WriteTimeoutException, because the coordinator (n3 here) sits waiting for the other 2 replicas to respond. On write timeouts you need to handle the error differently, usually by checking with a serial read (if using LWT) or another kind of read. Usually, though, the writes are idempotent and you can just try again.
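The two failure modes and the retry approach can be sketched with a toy coordinator. Everything here is an assumption for illustration: the `Unavailable` and `WriteTimeout` classes are local stand-ins (not the driver's exception classes), and `quorum_write` and `write_with_retry` are hypothetical helpers, not part of any Cassandra API.

```python
class Unavailable(Exception):
    """Stand-in: too few live replicas, so nothing was sent."""


class WriteTimeout(Exception):
    """Stand-in: the write was sent, but too few acknowledgements came back."""


def quorum_write(alive_before, alive_during, required=2):
    """Toy coordinator: return the replicas that applied the write, or raise."""
    if len(alive_before) < required:
        # Known-down replicas: the write is rejected before any replica is contacted.
        raise Unavailable("rejected before any replica was contacted")
    applied = set(alive_before) & set(alive_during)
    if len(applied) < required:
        # Replicas in `applied` keep the data: there is no rollback.
        raise WriteTimeout(f"only {sorted(applied)} acknowledged")
    return applied


def write_with_retry(attempts):
    """Retry on timeout; `attempts` is a list of (alive_before, alive_during)."""
    for alive_before, alive_during in attempts:
        try:
            return quorum_write(alive_before, alive_during)
        except WriteTimeout:
            continue  # safe only because the write is assumed to be idempotent
    raise WriteTimeout("all attempts timed out")


attempts = [
    ({"n1", "n2", "n3"}, {"n3"}),              # n1 and n2 die mid-write
    ({"n1", "n2", "n3"}, {"n1", "n2", "n3"}),  # both nodes are back
]
applied = write_with_retry(attempts)
```

Retrying is only appropriate when re-applying the same write leaves the same result; for non-idempotent operations such as counter updates, a blind retry could apply the change twice.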

With QUORUM and rf=3 you can only really safely handle one node going down. Once 2 replicas are down you're going to have lots of potential issues, up to and including data loss. This is why many people use multiple DCs and higher replication factors if the data is really important; that way even losing a DC won't impact things.
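The "one node" limit follows from the quorum arithmetic: a quorum is a majority of the replicas, computed as floor(rf / 2) + 1. A minimal sketch (the function names are made up for illustration):

```python
def quorum(replication_factor):
    """Number of replicas that must respond for a QUORUM operation."""
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor):
    """Replicas that can be down while QUORUM operations still succeed."""
    return replication_factor - quorum(replication_factor)


for rf in (3, 5):
    print(f"rf={rf}: quorum={quorum(rf)}, tolerated failures={tolerated_failures(rf)}")
```

With rf=3 the quorum is 2, so only one replica can be down; raising rf to 5 gives a quorum of 3 and tolerates two replicas being down.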

huangapple
  • Published on January 3, 2020, 19:36:13
  • When reposting, please keep the link to this article: https://go.coder-hub.com/59577937.html