DynamoDB TransactWriteItem Concurrency Issue
Question
Our system uses DynamoDB Streams to trigger a handler that creates two records in Table B whenever a new record is added to Table A.
Table B looks like this:
Each B#root record should have a matching Parent# record, which guarantees that the item's name is unique under its parent.
The parent is 8954, which we get from the stream of Table A.
| hashKey | sortKey | name | parentValue |
|---|---|---|---|
| B#1234 | B#root | Default | 8954;Default |
| Parent#8954;Default | Parent#8954;Default | | |
However, we have noticed that DynamoDB Streams sometimes sends two very closely timed events for the creation of a single new A record, causing our handler to create three B records instead of two. We suspect a concurrency issue, but we are not sure how to solve it.
In that case the handler runs the following code twice:
// Root item: the random UUID gives every invocation a different key.
HashMap<String, AttributeValue> rootItem = new HashMap<>();
rootItem.put("hashKey", new AttributeValue("B#" + UUID.randomUUID().toString()));
rootItem.put("sortKey", new AttributeValue("B#root"));
rootItem.put("name", new AttributeValue("Default"));
rootItem.put("parentValue", new AttributeValue(<Id given from dynamodb streams> + ";Default"));

// Parent item: the key is derived from the stream Id, so it is the same on every invocation.
HashMap<String, AttributeValue> parentItem = new HashMap<>();
parentItem.put("hashKey", new AttributeValue("Parent#" + <Id given from dynamodb streams> + ";Default"));
parentItem.put("sortKey", new AttributeValue("Parent#" + <Id given from dynamodb streams> + ";Default"));

// Unconditional put of the root item.
Put bRootItem = new Put()
    .withTableName("B")
    .withItem(rootItem);

// Conditional put of the parent item, intended to fail if it already exists.
String conditionalCheckExpression = "attribute_not_exists('hashKey') AND attribute_not_exists('sortKey')";
Put bParentItem = new Put()
    .withTableName("B")
    .withItem(parentItem)
    .withConditionExpression(conditionalCheckExpression);

// Both puts go into a single transaction.
Collection<TransactWriteItem> actions = Arrays.asList(
    new TransactWriteItem().withPut(bRootItem),
    new TransactWriteItem().withPut(bParentItem));
This results in the following records:
| hashKey | sortKey | name | parentValue |
|---|---|---|---|
| B#123456 | B#root | Default | 658;Default |
| B#123689 | B#root | Default | 658;Default |
| Parent#658;Default | Parent#658;Default | | |
I think there is a concurrency issue here, right?
Our goal is for the handler to be idempotent and to create at most two B records for each A record. We added a condition expression to the Put operation to prevent the parent item from being created twice, but it does not seem to work as expected.
I expect just two records, either these:
| hashKey | sortKey | name | parentValue |
|---|---|---|---|
| B#123456 | B#root | Default | 658;Default |
| Parent#658;Default | Parent#658;Default | | |
or these:
| hashKey | sortKey | name | parentValue |
|---|---|---|---|
| B#123689 | B#root | Default | 658;Default |
| Parent#658;Default | Parent#658;Default | | |
Can you explain why this is happening and suggest a solution to our problem?
Answer 1
Score: 0
It's impossible for a transaction to create the child when it contains a condition check on the existence of an item and that check evaluates to false. Transactions are ACID compliant, and their atomicity ensures that the entire transaction fails.
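To make the atomicity argument concrete, here is a minimal in-memory sketch of the all-or-nothing behavior. It does not call DynamoDB or the AWS SDK: `TransactionSketch`, `TableB`, `PutAction`, `transactWrite`, and `handle` are hypothetical names invented for illustration, and the `mustNotExist` flag stands in for an `attribute_not_exists` condition on the item key.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

public class TransactionSketch {

    /** One put inside a transaction; mustNotExist models attribute_not_exists on the key. */
    static final class PutAction {
        final String hashKey;
        final String sortKey;
        final boolean mustNotExist;

        PutAction(String hashKey, String sortKey, boolean mustNotExist) {
            this.hashKey = hashKey;
            this.sortKey = sortKey;
            this.mustNotExist = mustNotExist;
        }

        String key() {
            return hashKey + "|" + sortKey;
        }
    }

    /** In-memory stand-in for table B (an assumption, not a real DynamoDB client). */
    static final class TableB {
        private final Map<String, PutAction> items = new HashMap<>();

        /** Applies every put or none of them; returns false if any condition fails. */
        synchronized boolean transactWrite(List<PutAction> actions) {
            for (PutAction action : actions) {
                if (action.mustNotExist && items.containsKey(action.key())) {
                    return false; // transaction cancelled, nothing is written
                }
            }
            for (PutAction action : actions) {
                items.put(action.key(), action);
            }
            return true;
        }

        synchronized int size() {
            return items.size();
        }
    }

    /** Mirrors the handler in the question: random root key, fixed parent key. */
    static boolean handle(TableB table, String parentId) {
        String parentKey = "Parent#" + parentId + ";Default";
        return table.transactWrite(List.of(
                new PutAction("B#" + UUID.randomUUID(), "B#root", false),
                new PutAction(parentKey, parentKey, true)));
    }

    public static void main(String[] args) {
        TableB table = new TableB();
        boolean first = handle(table, "658");
        boolean second = handle(table, "658"); // duplicate stream event
        System.out.println(first + " " + second + " " + table.size()); // prints "true false 2"
    }
}
```

In this model the second invocation is rejected as a whole, so the extra B#root item is never written. If a third record shows up against the real table, it is worth checking that the condition is actually being evaluated: attribute names in a DynamoDB condition expression are normally written unquoted, as in `attribute_not_exists(hashKey)`, and whether the quoted form used in the question behaves the same way is something to verify rather than assume.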
It is possible that another process you are not aware of is adding children. In that case, my suggestion is to enable data plane logging on the table and evaluate all actions pertaining to that item. The logs will show more detail about where the API calls originated, so you can debug further.