Notes on Using "Select ... For Update" for Uniqueness in Mysql

I encountered a deadlock recently. Similar questions have been asked on StackOverflow, e.g., this and this. But the answers didn’t really explain why this happens.

The situation is quite easy to reproduce @ Mysql 5.7.17 (also tested on other versions in 5.5 or 5.6):

1
2
3
4
5
6
7
8
CREATE TABLE `test` (
`id` bigint(11) NOT NULL AUTO_INCREMENT,
`val` varchar(255) NOT NULL,
PRIMARY KEY (`id`),
KEY `search` (`val`)
) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=utf8;

insert into test set val='pre-lock';

session1

1
2
start transaction;
select * from test where val='pre-lock' for update;

session2

1
2
start transaction;
select * from test where val='pre-lock' for update;

session1

1
insert into test set val='/a/b/c';

session2

1
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction

The result of show engine innodb status:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
LATEST DETECTED DEADLOCK
------------------------
2017-04-06 23:54:03 0x7000057db000
*** (1) TRANSACTION:
TRANSACTION 1333, ACTIVE 18 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 2 lock struct(s), heap size 1136, 1 row lock(s)
MySQL thread id 5, OS thread handle 123145394155520, query id 62 localhost root Sending data
select * from test where val='pre-lock' for update
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 24 page no 4 n bits 72 index search of table `test_tnx`.`test` trx id 1333 lock_mode X waiting
Record lock, heap no 2 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
0: len 8; hex 7072652d6c6f636b; asc pre-lock;;
1: len 8; hex 8000000000000001; asc ;;

*** (2) TRANSACTION:
TRANSACTION 1332, ACTIVE 29 sec inserting
mysql tables in use 1, locked 1
4 lock struct(s), heap size 1136, 4 row lock(s), undo log entries 1
MySQL thread id 62, OS thread handle 123145394434048, query id 63 localhost root update
insert into test set val='/a/b/c'
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 24 page no 4 n bits 72 index search of table `test_tnx`.`test` trx id 1332 lock_mode X
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
0: len 8; hex 73757072656d756d; asc supremum;;

Record lock, heap no 2 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
0: len 8; hex 7072652d6c6f636b; asc pre-lock;;
1: len 8; hex 8000000000000001; asc ;;

*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 24 page no 4 n bits 72 index search of table `test_tnx`.`test` trx id 1332 lock_mode X locks gap before rec insert intention waiting
Record lock, heap no 2 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
0: len 8; hex 7072652d6c6f636b; asc pre-lock;;
1: len 8; hex 8000000000000001; asc ;;

*** WE ROLL BACK TRANSACTION (1)

My objective is to use select ... for update as a uniqueness check for a following sequence of insertions. I expected that Tnx 2 would wait until Tnx 1 released the lock, and then continue its own business. However, Tnx 2 is rolled back due to deadlock. The innodb status looks quite confusing. Tnx 1 is holding and waiting for the same lock.

After some research, though I still cannot figure out the root cause, my perception is that the insertion in Tnx 1 acquires a gap lock which is somehow overlapping with the gap lock by the select ... for update. And therefore, this create a deadlock where Tnx 1 waits for Tnx 2 and Tnx 2 waits for Tnx 1.

During my research, I found that the right use case for select ... for update is as follows:

1
2
3
4
start transaction;
select * from [table] where [condition] for update;
insert into [table] values [belongs to condition];
delete from [table] where [belongs to condition];

The rows being mutated should be explicitly locked by the select ... for update. Also, the condition should be as clear as possible. For example, put only an unique key in the condition. This is to make the gap lock with a simple and clear range, in order not to cause deadlocks.

Generally, using select ... for update is non-trivial since the underlying locking mechanism seems quite complicated. For my scenario, I got two workarounds:

  1. Disable gap locks by setting the isolation level to READ COMMITTED.
  2. Apply select ... for update on a row from another table, which avoid possible lock overlap.