Resolving write problems when writing to a Cassandra replica set


Question


By design we can write to any node in a Cassandra replica set. In my situation, the replica set has 2 nodes. When I make a write operation to node A but the node is unavailable, do I have to catch the exception and then re-write to node B manually?

In MongoDB, the driver has "Retryable Writes" to automatically write to another node if the primary node is down. Does Cassandra have this feature?

Answer 1

Score: 1


When you write to Cassandra you specify the consistency level you wish to write with - ranging from ANY, which provides no guarantees, up to ALL, which requires that all replicas in all DCs acknowledge back to the co-ordinator.
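
As an illustration (not part of the original answer), here is a minimal sketch using the DataStax Python driver (cassandra-driver); the contact points, keyspace, and table are hypothetical:

```python
from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

# Connect to the cluster (contact points and keyspace are placeholders).
cluster = Cluster(["10.0.0.1", "10.0.0.2"])
session = cluster.connect("my_keyspace")

# The consistency level is attached to the statement itself:
# ANY gives no guarantee, ALL requires every replica to acknowledge.
stmt = SimpleStatement(
    "INSERT INTO users (id, name) VALUES (%s, %s)",
    consistency_level=ConsistencyLevel.LOCAL_QUORUM,
)
session.execute(stmt, (123, "alice"))
```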

This write is sent to a single node, chosen by your load balancing policy. That node acts as the co-ordinator for the whole operation and will return a single response of success / exception. Your application does not have to send the write to multiple nodes individually; it just sends it to one node (any node can be used), which co-ordinates the write to the replicas.
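
A sketch of the load balancing side, again with the Python driver (the data centre name "dc1" is an assumption): the driver only needs contact points to discover the ring, and the policy decides which node acts as co-ordinator for each request.

```python
from cassandra.cluster import Cluster, ExecutionProfile, EXEC_PROFILE_DEFAULT
from cassandra.policies import DCAwareRoundRobinPolicy, TokenAwarePolicy

# Token-aware policy wrapped around a DC-aware policy: the driver prefers
# a replica of the data being written as the co-ordinator, staying in "dc1".
profile = ExecutionProfile(
    load_balancing_policy=TokenAwarePolicy(
        DCAwareRoundRobinPolicy(local_dc="dc1")  # "dc1" is a placeholder
    )
)
cluster = Cluster(
    contact_points=["10.0.0.1"],
    execution_profiles={EXEC_PROFILE_DEFAULT: profile},
)
session = cluster.connect()
```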

In a normal scenario, using local_quorum for a write with the very common replication factor of 3, as long as the co-ordinator has 2 of the 3 nodes acknowledging the write, the application will not get any exception - even if the 3rd node fails to write the data.

There is a retry policy available on the driver, which can allow for a retry in the event of a timeout; you should ensure, though, that the operation is idempotent when using this (for example, when appending an item to a list, a retry could result in the item being on the list twice on one of the replicas).
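
A hedged sketch of this with the Python driver: the default RetryPolicy is configured explicitly here, and the statement is flagged as idempotent so the driver may safely replay it (the is_idempotent parameter exists in recent driver versions; exact retry behaviour on timeouts depends on the driver version and timeout type). The keyspace and table are hypothetical.

```python
from cassandra.cluster import Cluster, ExecutionProfile, EXEC_PROFILE_DEFAULT
from cassandra.policies import RetryPolicy
from cassandra.query import SimpleStatement

profile = ExecutionProfile(retry_policy=RetryPolicy())  # the driver default
cluster = Cluster(["10.0.0.1"],
                  execution_profiles={EXEC_PROFILE_DEFAULT: profile})
session = cluster.connect("my_keyspace")

# Marking the statement idempotent tells the driver it is safe to replay.
# An append to a list column is NOT idempotent, so it should not be flagged.
stmt = SimpleStatement(
    "UPDATE users SET name = %s WHERE id = %s",
    is_idempotent=True,
)
session.execute(stmt, ("alice", 123))
```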

With your particular replication factor of 2, you are currently in a position where you lack either consistency guarantees or resilience:

  • one / local_one - only guarantees that one of the nodes got the write (both are likely to get it, but there is no guarantee provided).
  • quorum / local_quorum - requires both nodes to acknowledge, so you have no ability to handle a node failure.

This is because the quorum of 2 is 2 - if you used 3 nodes with RF=3, then local_quorum would require 2 of the 3, which would allow a node to be down while providing a stronger guarantee on consistency.
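
The quorum arithmetic behind this is simply floor(RF/2) + 1; a throwaway snippet makes the RF=2 vs RF=3 difference explicit:

```python
def quorum(replication_factor: int) -> int:
    # Quorum is a strict majority of the replicas.
    return replication_factor // 2 + 1

print(quorum(2))  # 2 -> both replicas must respond; no node may be down
print(quorum(3))  # 2 -> one replica can be down and writes still succeed
```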
