I encountered a deadlock recently. Similar questions have been asked on StackOverflow, e.g., this and this. But the answers didn’t really explain why this happens.
The situation is quite easy to reproduce @ Mysql 5.7.17 (also tested on other versions in 5.5 or 5.6):
LATEST DETECTED DEADLOCK ------------------------ 2017-04-06 23:54:03 0x7000057db000 *** (1) TRANSACTION: TRANSACTION 1333, ACTIVE 18 sec starting index read mysql tables in use1, locked1 LOCKWAIT2lockstruct(s), heapsize1136, 1rowlock(s) MySQL threadid5, OS thread handle 123145394155520, queryid62 localhost root Sending data select * fromtestwhere val='pre-lock'forupdate *** (1) WAITING FOR THIS LOCKTO BE GRANTED: RECORD LOCKS spaceid24 page no4 n bits 72indexsearchoftable`test_tnx`.`test` trx id1333 lock_mode X waiting Recordlock, heapno2PHYSICALRECORD: n_fields 2; compact format; info bits 0 0: len 8; hex 7072652d6c6f636b; asc pre-lock;; 1: len 8; hex 8000000000000001; asc ;;
*** (2) TRANSACTION: TRANSACTION 1332, ACTIVE 29 sec inserting mysql tables in use1, locked1 4lockstruct(s), heapsize1136, 4rowlock(s), undolog entries 1 MySQL threadid62, OS thread handle 123145394434048, queryid63 localhost root update insertintotestset val='/a/b/c' *** (2) HOLDS THE LOCK(S): RECORD LOCKS spaceid24 page no4 n bits 72indexsearchoftable`test_tnx`.`test` trx id1332 lock_mode X Recordlock, heapno1PHYSICALRECORD: n_fields 1; compact format; info bits 0 0: len 8; hex 73757072656d756d; asc supremum;;
Record lock, heapno2PHYSICALRECORD: n_fields 2; compact format; info bits 0 0: len 8; hex 7072652d6c6f636b; asc pre-lock;; 1: len 8; hex 8000000000000001; asc ;;
*** (2) WAITING FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 24 page no 4 n bits 72 index search of table `test_tnx`.`test` trx id 1332 lock_mode X locks gap before rec insert intention waiting Recordlock, heapno2PHYSICALRECORD: n_fields 2; compact format; info bits 0 0: len 8; hex 7072652d6c6f636b; asc pre-lock;; 1: len 8; hex 8000000000000001; asc ;;
*** WE ROLL BACK TRANSACTION (1)
My objective is to use select ... for update as a uniqueness check for a following sequence of insertions. I expected that Tnx 2 would wait until Tnx 1 released the lock, and then continue its own business. However, Tnx 2 is rolled back due to deadlock. The innodb status looks quite confusing. Tnx 1 is holding and waiting for the same lock.
After some research, though I still cannot figure out the root cause, my perception is that the insertion in Tnx 1 acquires a gap lock which is somehow overlapping with the gap lock by the select ... for update. And therefore, this create a deadlock where Tnx 1 waits for Tnx 2 and Tnx 2 waits for Tnx 1.
During my research, I found that the right use case for select ... for update is as follows:
1 2 3 4
starttransaction; select * from [table] where [condition] forupdate; insertinto [table] values [belongs to condition]; deletefrom [table] where [belongs to condition];
The rows being mutated should be explicitly locked by the select ... for update. Also, the condition should be as clear as possible. For example, put only an unique key in the condition. This is to make the gap lock with a simple and clear range, in order not to cause deadlocks.
Generally, using select ... for update is non-trivial since the underlying locking mechanism seems quite complicated. For my scenario, I got two workarounds:
Disable gap locks by setting the isolation level to READ COMMITTED.
Apply select ... for update on a row from another table, which avoid possible lock overlap.