Resolve write problem when writing to a Cassandra replica set
Question
By design we can write to any node in a Cassandra replica set. Suppose my replica set has 2 nodes, and I make a write operation to node A but that node is unavailable. Do I have to catch the exception and then re-write to node B manually?
In MongoDB, the driver has "Retryable Writes" to automatically write to another node if the primary node is down. Does Cassandra have this feature?
Answer 1
Score: 1
When you write to Cassandra you specify the consistency level you wish to write with - ranging from ANY, which provides no guarantees, up to ALL, which requests that all replicas in all DCs acknowledge back to the co-ordinator.
This write is sent to a single node - chosen according to your load balancing policy - and that node acts as the co-ordinator for the whole operation and returns a single success/exception response. Your application does not have to send the write to multiple nodes individually; it just sends it to one node (any node can be used), which co-ordinates the write to the replicas.
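As an illustration, here is a minimal sketch using the DataStax Python driver (cassandra-driver); the contact points, keyspace, and table names are hypothetical. Either node can serve as the contact point, and whichever node the driver picks coordinates the write at the requested consistency level:

```python
# Minimal sketch with the DataStax Python driver (cassandra-driver).
# Contact points, keyspace, and table names are hypothetical.
from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

# Any node can be given as a contact point; the node the driver picks
# for a request acts as the coordinator for that write.
cluster = Cluster(contact_points=["node-a", "node-b"])
session = cluster.connect("my_keyspace")

# The consistency level is attached to the statement; the coordinator
# returns one success/error response once enough replicas acknowledge.
insert = SimpleStatement(
    "INSERT INTO users (id, name) VALUES (%s, %s)",
    consistency_level=ConsistencyLevel.LOCAL_QUORUM,
)
session.execute(insert, (1, "alice"))
```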
In a normal scenario of using local_quorum for a write, with a very normal replication factor of 3, then as long as the co-ordinator has 2 of the 3 nodes providing acknowledgement of the write, the application will not get any exception - even if the 3rd node fails to write the data.
There is a retry policy available on the driver which can allow for a retry in the event of a timeout; you should ensure, though, that the operation is idempotent when using this (for example, when appending an item to a list, a retry could result in the item being on the list twice on one of the replicas).
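A sketch of how this could look with the Python driver, again with hypothetical table and column names: the plain insert is explicitly marked idempotent so the driver is allowed to retry it, while the list append is not:

```python
# Sketch of marking a statement idempotent before relying on retries
# (DataStax Python driver; table and column names are hypothetical).
from cassandra.query import SimpleStatement

# Re-inserting the same row produces the same end state, so this write
# is safe for the driver to retry after a timeout.
upsert = SimpleStatement("INSERT INTO users (id, name) VALUES (%s, %s)")
upsert.is_idempotent = True

# Appending to a list is NOT idempotent: a retry after a timeout could
# leave the same element on the list twice on some replicas.
append = SimpleStatement(
    "UPDATE users SET emails = emails + ['a@example.com'] WHERE id = %s"
)
append.is_idempotent = False  # the default, shown here for emphasis
```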
With your particular replication factor being 2, you currently lack either consistency guarantees or resilience:
- one / local_one - only guarantees that one of the nodes got the write (both are likely to get it, but there is no guarantee provided).
- quorum / local_quorum - requires both nodes to acknowledge, so you have no ability to handle a node failure.
This is because the quorum of 2 is 2 - if you used 3 nodes with an RF=3, then local_quorum requires 2 of the 3, which would allow a node to be down while providing a stronger guarantee on consistency.
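The arithmetic behind this (quorum = floor(RF / 2) + 1) can be checked directly:

```python
# Quorum is floor(RF / 2) + 1 replicas.
def quorum(replication_factor: int) -> int:
    return replication_factor // 2 + 1

print(quorum(2))  # 2 -> with RF=2, no replica may be down for QUORUM writes
print(quorum(3))  # 2 -> with RF=3, one replica may be down and QUORUM still succeeds
```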
Comments