Deadlock with Django / MYSQL and filter on select_for_update
Question
<details>
<summary>English:</summary>
I am getting deadlocks in my project, where multiple processes alter the same objects in the database.
I have an endpoint whose purpose is to fetch the most recent unfinished play and alter it if one exists; if none exists, it should create one. I need to make sure that if this endpoint receives two concurrent requests, one of them creates the object while the other blocks until the object has been created, and then the second one alters the newly created object.
In my Django application I run the following query for that inside a `transaction.atomic()` context:
```py
# Use list() to force evaluation of the queryset
play = list(
    Play.objects.select_for_update().filter(
        game=self.game,
        user=self.user,
        discard=False,
        finished=False,
    )
)
```
It used to be `.last()`, but I read that the `ORDER BY` it adds to the database query can sometimes cause deadlocks, so I tried this instead.
From my understanding (which is probably flawed), MySQL should acquire an exclusive record lock on the `finished` index (I have that column indexed in my database), which would prevent an object from being created while this transaction is in progress; any transaction that tries to create one would block until this one completes. It should also acquire a record lock on the row itself so that no other transaction can change its contents.
In heavy load testing I never had concurrency issues and everything works as intended. However, I sometimes find deadlocks in the database that I don't fully understand, involving this particular code:
```py
with transaction.atomic():
    Play.objects.select_for_update().get(pk=play.pk)
    # <Changes to the Play>
    play.save()
```
Running `SHOW ENGINE INNODB STATUS` gave me the following report, which I edited for simplicity's sake:
```
*** (1) TRANSACTION:
TRANSACTION 901805890, ACTIVE 0 sec fetching rows
mysql tables in use 3, locked 3
LOCK WAIT 116 lock struct(s), heap size 24696, 5 row lock(s)
SELECT `*` FROM `games_play` WHERE (`games_play`.`discard` = 0 AND `games_play`.`finished` = 0 AND `games_play`.`game_id` = 1 AND `games_play`.`user_id` = 28) FOR UPDATE
*** (1) HOLDS THE LOCK(S):
RECORD LOCKS space id 4268 page no 411 n bits 1616 index games_play_finished_71622b41 of table `test_dev`.`games_play` trx id 901805890 lock_mode X locks rec but not gap
Record lock, heap no 18 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
0: len 1; hex 80; asc ;;
1: len 4; hex 8000425c; asc B\;;
Record lock, heap no 19 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
0: len 1; hex 80; asc ;;
1: len 4; hex 8000425d; asc B];;
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 4268 page no 969 n bits 80 index PRIMARY of table `test_dev`.`games_play` trx id 901805890 lock_mode X locks rec but not gap waiting
Record lock, heap no 6 PHYSICAL RECORD: n_fields 13; compact format; info bits 0
0: len 4; hex 8000425d; asc B];;
1: len 6; hex 000035c07741; asc 5 wA;;
2: len 7; hex 02000001a42d67; asc -g;;
3: len 1; hex 81; asc ;;
4: len 30; hex 0003006e00190004001d000400210008000029000c2d0004010062657473; asc n ! ) - bets; (total 111 bytes);
5: len 7; hex 63726173685f37; asc crash_7;;
6: len 30; hex 00020073001200070019000e00002700004f00696e697469616c6f6e5f6c; asc s ' O initialon_l; (total 116 bytes);
7: len 8; hex 99b07498ef076fa0; asc t o ;;
8: len 8; hex 99b07498f40dce58; asc t X;;
9: len 1; hex 81; asc ;;
10: len 4; hex 80000001; asc ;;
11: len 4; hex 8000000f; asc ;;
12: SQL NULL;
*** (2) TRANSACTION:
TRANSACTION 901805889, ACTIVE 0 sec updating or deleting
mysql tables in use 1, locked 1
LOCK WAIT 3 lock struct(s), heap size 1128, 2 row lock(s), undo log entries 1
UPDATE `games_play` SET `finished` = 1, `details` = '{}', `game_token` = '1234', `jwt_token` = NULL, `game_id` = 1, `user_id` = 15, `created` = '2023-06-26', `modified` = '2023-06-26', `discard` = 1 WHERE `games_play`.`id` = 16989
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 4268 page no 969 n bits 80 index PRIMARY of table `test_dev`.`games_play` trx id 901805889 lock_mode X locks rec but not gap
Record lock, heap no 6 PHYSICAL RECORD: n_fields 13; compact format; info bits 0
0: len 4; hex 8000425d; asc B];;
1: len 6; hex 000035c07741; asc 5 wA;;
2: len 7; hex 02000001a42d67; asc -g;;
3: len 1; hex 81; asc ;;
4: len 30; hex 0003006e00190004001d000400210008000029000c2d0004010062657473; asc n ! ) - bets; (total 111 bytes);
5: len 7; hex 63726173685f37; asc crash_7;;
6: len 30; hex 00020073001200070019000e00002700004f00696e697469616c6f6e5f6c; asc s ' O initialon_l; (total 116 bytes);
7: len 8; hex 99b07498ef076fa0; asc t o ;;
8: len 8; hex 99b07498f40dce58; asc t X;;
9: len 1; hex 81; asc ;;
10: len 4; hex 80000001; asc ;;
11: len 4; hex 8000000f; asc ;;
12: SQL NULL;
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 4268 page no 411 n bits 1616 index games_play_finished_71622b41 of table `test_dev`.`games_play` trx id 901805889 lock_mode X locks rec but not gap waiting
Record lock, heap no 19 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
0: len 1; hex 80; asc ;;
1: len 4; hex 8000425d; asc B];;
```
From what I understand of this report, the first transaction acquired the lock on the `finished` index to prevent insertion and was trying to acquire the lock on the play itself, but that lock was already held by the other transaction, which in turn wanted to acquire the lock on the `finished` index. However, these plays belong to different users, so in my mind the first query should never have tried to lock that row.
I would like to understand what is going on here and what could be done to prevent it.
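
One way to read the report is as a wait-for graph: each transaction holds some record locks and waits for a lock held by the other. The sketch below is a toy model in plain Python, not something MySQL or Django exposes; the lock labels and the `find_deadlock` helper are hypothetical names introduced for illustration. The primary key values come from decoding the hex fields in the report (`8000425c` is 16988 and `8000425d` is 16989 once the flipped sign bit is removed), which matches the `id = 16989` in the `UPDATE`.

```python
# Toy model of the deadlock report above. Lock names are illustrative
# labels of the form "<index>:<primary key>", not InnoDB identifiers.
holds = {
    # Transaction (1): SELECT ... FOR UPDATE, scanning the `finished` index
    "trx_901805890": {"finished_idx:16988", "finished_idx:16989"},
    # Transaction (2): UPDATE ... WHERE id = 16989
    "trx_901805889": {"PRIMARY:16989"},
}
waits_for = {
    "trx_901805890": "PRIMARY:16989",
    "trx_901805889": "finished_idx:16989",
}


def find_deadlock(holds, waits_for):
    """Return the transactions that form a wait cycle, or None."""
    # Map every held lock back to the transaction that owns it.
    owner = {lock: trx for trx, locks in holds.items() for lock in locks}
    for start in waits_for:
        path = [start]
        current = start
        while True:
            blocker = owner.get(waits_for.get(current))
            if blocker is None:
                break  # the chain ends: nobody holds the wanted lock
            if blocker in path:
                return path[path.index(blocker):]  # cycle found
            path.append(blocker)
            current = blocker
    return None


cycle = find_deadlock(holds, waits_for)
print(cycle)  # ['trx_901805890', 'trx_901805889']
```

The cycle exists because the two statements take the same pair of locks in opposite order: the filtered read goes from the secondary index to the primary key, while the update goes from the primary key to the secondary index.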
</details>
# Answer 1
**Score**: 1
It appears you only have a single-column index on `finished`, e.g.:

```py
from django.db import models


class Play(models.Model):
    finished = models.BooleanField(db_index=True)
    ...
```

In that case, selecting-for-update an unfinished `Play` with `.get(pk=play.pk)` will first lock the `PRIMARY` index record for the `pk`. The subsequent `UPDATE` that changes `finished` will then attempt to lock the `finished = 0` index record, which the filtered `SELECT ... FOR UPDATE` of another transaction may already hold.

Remove it and add an index on `["user_id", "finished"]`:

```py
class Play(models.Model):
    finished = models.BooleanField()
    ...

    class Meta:
        indexes = [
            models.Index(fields=["user_id", "finished"]),
        ]
```

In this case, the index lock will be for separate users, e.g. `user_id = 15 AND finished = 0`.

Remember to run migrations:

```sh
python manage.py makemigrations
python manage.py migrate
```
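
To see why the composite index may help, it can be useful to compare which index entries the two statements touch under each index definition. The following is a deliberately simplified model in plain Python, not a description of InnoDB internals: it treats a secondary-index entry as the indexed column values followed by the primary key, and assumes a locking read visits every entry matching the columns it can seek on. `PlayRow`, `index_entry` and `entries_scanned` are hypothetical helpers, and the assumption that row 16988 belongs to user 28 is made up for illustration, since the report does not show it.

```python
from typing import NamedTuple


class PlayRow(NamedTuple):
    id: int
    user_id: int
    finished: int


def index_entry(row, columns):
    """Secondary-index entry for `row`: indexed column values + primary key."""
    return tuple(getattr(row, c) for c in columns) + (row.id,)


def entries_scanned(rows, columns, predicate):
    """Entries a locking read visits when it can only seek on `columns`."""
    wanted = {c: predicate[c] for c in columns if c in predicate}
    return {
        index_entry(r, columns)
        for r in rows
        if all(getattr(r, c) == v for c, v in wanted.items())
    }


# Assumed data: two unfinished plays belonging to different users.
rows = [PlayRow(16988, 28, 0), PlayRow(16989, 15, 0)]
select_where = {"user_id": 28, "finished": 0}  # transaction (1)'s filter
updated = rows[1]                               # row changed by transaction (2)

results = {}
for columns in (["finished"], ["user_id", "finished"]):
    scanned = entries_scanned(rows, columns, select_where)
    # Conflict: the read visits the index entry the UPDATE has to modify.
    results[tuple(columns)] = index_entry(updated, columns) in scanned

print(results)
# {('finished',): True, ('user_id', 'finished'): False}
```

Under these assumptions, the single-column index makes the read for user 28 visit the entry of user 15's row, while the composite index keeps the two users' entries apart.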