引用外部文章:
死锁-产生原因和解决方法
今日查看线上日志时无意中发现了一些mysql的错误日志:
org.springframework.dao.DeadlockLoserDataAccessException:
### Error updating database. Cause: com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction
### The error may involve com.*.*.*.mapper.*Mapper.doSth-Inline
### The error occurred while setting parameters
### SQL: update *** set *** where ***
### Cause: com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction
; SQL []; Deadlock found when trying to get lock; try restarting transaction; nested exception is com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction
at org.springframework.jdbc.support.SQLErrorCodeSQLExceptionTranslator.doTranslate(SQLErrorCodeSQLExceptionTranslator.java:263)
at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:73)
看了一下,提示很明显是死锁的报错。
Deadlock found when trying to get lock; try restarting transaction
下面记录一下查询及解决此次死锁问题的过程:
- 在数据库执行
show engine innodb status;
查询搜索引擎状态。
=====================================
2020-11-08 11:13:14 2b9b779d1700 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 52 seconds
-----------------
BACKGROUND THREAD
-----------------
srv_master_thread loops: 76615811 srv_active, 0 srv_shutdown, 85780 srv_idle
srv_master_thread log flush and writes: 76701591
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 21660336
OS WAIT ARRAY INFO: signal count 1104381917
Mutex spin waits 3182212742, rounds 2905091141, OS waits 4181214
RW-shared spins 305887541, rounds 1471583934, OS waits 3836427
RW-excl spins 198637096, rounds 3052944917, OS waits 13540134
Spin rounds per wait: -2.61 mutex, 4.81 RW-shared, 15.37 RW-excl
------------------------
LATEST DETECTED DEADLOCK
------------------------
2020-11-08 00:46:10 2b9b89e89700
*** (1) TRANSACTION:
TRANSACTION 3951254229, ACTIVE 0.205 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 11 lock struct(s), heap size 1184, 8 row lock(s), undo log entries 7
LOCK BLOCKING MySQL thread id: 15676758 block 15688650
MySQL thread id 15688650, OS thread handle 0x2b9b8b440700, query id 29912949200 IP地址 数据库用户名 updating
update t_table1_test
set
*
where id = 1604641
and *
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 3694 page no 23003 n bits 112 index `PRIMARY` of table db.t_table1_test trx id 3951254229 lock_mode X locks rec but not gap waiting
Record lock, heap no 17 PHYSICAL RECORD: n_fields 48; compact format; info bits 0
0: len 8; hex 8000000000187c21; asc |!;;
1: len 6; hex 0000eb835b16; asc [ ;;
2: len 7; hex 59000005382456; asc Y 8$V;;
3: len 14; hex 3938323032303131303730303334; asc 98202011070034;;
4: len 0; hex ; asc ;;
5: len 8; hex 801d23af8445f051; asc # E Q;;
6: len 8; hex 8000000000000334; asc 4;;
7: len 8; hex 800000000034393c; asc 49<;;
8: len 0; hex ; asc ;;
9: len 30; hex e9b29ce5b8a6e79aaee4ba94e88ab1e8828920e8a7a3e586bbe5938120e6; asc ; (total 32 bytes);
10: len 9; hex e586bbe58c96e9b29c; asc ;;
11: len 0; hex ; asc ;;
12: len 9; hex e4bbbd2835e696a429; asc (5 );;
13: len 1; hex 82; asc ;;
14: len 3; hex e696a4; asc ;;
15: len 5; hex 8000000000; asc ;;
16: len 3; hex e696a4; asc ;;
17: len 1; hex 81; asc ;;
18: len 4; hex 80000002; asc ;;
19: len 6; hex e88289e7b1bb; asc ;;
20: len 4; hex 80000079; asc y;;
21: len 15; hex e78caae88289e58886e589b2e7b1bb; asc ;;
22: len 1; hex 80; asc ;;
23: len 4; hex 80000001; asc ;;
24: len 1; hex 82; asc ;;
25: len 8; hex 800000000098b5ad; asc ;;
26: len 2; hex 4459; asc DY;;
27: len 5; hex 8000014f00; asc O ;;
28: len 4; hex 8000014f; asc O;;
29: len 4; hex 80000026; asc &;;
30: len 4; hex 80000019; asc ;;
31: len 5; hex 800000d35f; asc _;;
32: len 4; hex 8000002d; asc -;;
33: len 4; hex 80000006; asc ;;
34: len 5; hex 8000002c04; asc , ;;
35: len 5; hex 8000000000; asc ;;
36: len 5; hex 8000000000; asc ;;
37: len 8; hex 8000000000000001; asc ;;
38: len 18; hex e7be8ee88f9ce887aae890a5e8b4a7e4b8bb; asc ;;
39: len 4; hex 80000000; asc ;;
40: len 30; hex 3832305f315f34323735325f315f305f313630343736343830305f305f31; asc 820_1_42752_1_0_1604764800_0_1; (total 38 bytes);
41: len 8; hex 8000000000000000; asc ;;
42: len 4; hex 80000000; asc ;;
43: len 4; hex dfa5f927; asc ;;
44: len 8; hex 8000000000000000; asc ;;
45: len 4; hex dfa6cf52; asc R;;
46: len 8; hex 8000000000066246; asc bF;;
47: len 1; hex 80; asc ;;
*** (2) TRANSACTION:
TRANSACTION 3951254294, ACTIVE 0.001 sec fetching rows
mysql tables in use 1, locked 1
7 lock struct(s), heap size 1184, 4 row lock(s), undo log entries 1
MySQL thread id 15676758, OS thread handle 0x2b9b89e89700, query id 29912949199 IP地址 数据库用户名 Searching rows for update
update t_table2_test
set *
where *
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 3694 page no 23003 n bits 112 index `PRIMARY` of table db.t_table1_test trx id 3951254294 lock_mode X locks rec but not gap
Record lock, heap no 17 PHYSICAL RECORD: n_fields 48; compact format; info bits 0
0: len 8; hex 8000000000187c21; asc |!;;
1: len 6; hex 0000eb835b16; asc [ ;;
2: len 7; hex 59000005382456; asc Y 8$V;;
3: len 14; hex 3938323032303131303730303334; asc 98202011070034;;
4: len 0; hex ; asc ;;
5: len 8; hex 801d23af8445f051; asc # E Q;;
6: len 8; hex 8000000000000334; asc 4;;
7: len 8; hex 800000000034393c; asc 49<;;
8: len 0; hex ; asc ;;
9: len 30; hex e9b29ce5b8a6e79aaee4ba94e88ab1e8828920e8a7a3e586bbe5938120e6; asc ; (total 32 bytes);
10: len 9; hex e586bbe58c96e9b29c; asc ;;
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 3695 page no 23978 n bits 136 index `PRIMARY` of table db.t_table2_test trx id 3951254294 lock_mode X locks rec but not gap waiting
Record lock, heap no 2 PHYSICAL RECORD: n_fields 29; compact format; info bits 0
0: len 8; hex 80000000001c08eb; asc ;;
1: len 6; hex 0000eb835ad5; asc Z ;;
2: len 7; hex 38000002030dd7; asc 8 ;;
3: len 8; hex 8000000000187c24; asc |$;;
4: len 14; hex 3938323032303131303730303334; asc 98202011070034;;
5: len 8; hex 8000000000000334; asc 4;;
6: len 8; hex 80000000003996ac; asc 9 ;;
7: len 8; hex 80000000004516d2; asc E ;;
8: len 10; hex e4bbbd283130e696a429; asc (10 );;
9: len 4; hex 80000001; asc ;;
10: len 5; hex 8000007800; asc x ;;
*** WE ROLL BACK TRANSACTION (2)
------------
TRANSACTIONS
------------
Trx id counter 3953292095
Purge done for trx s n:o < 3953292094 undo n:o < 0 state: running but idle
History list length 2103
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 329424057858152, not started
MySQL thread id 15718908, OS thread handle 0x2b9b779d1700, query id 29932757139 IP地址 数据库用户名 init
show engine innodb status
---TRANSACTION 329423667869800, not started
MySQL thread id 15718839, OS thread handle 0x2b9bc5440700, query id 29932756275 IP地址 数据库用户名 cleaning up
---TRANSACTION 329423157233768, not started
MySQL thread id 15718566, OS thread handle 0x2b9b9f881700, query id 29932754328 IP地址 数据库用户名 cleaning up
--------
FILE I/O
--------
I/O thread 0 state: waiting for i/o request (insert buffer thread)
I/O thread 1 state: waiting for i/o request (log thread)
I/O thread 2 state: waiting for i/o request (read thread)
Pending normal aio reads: 0 [0, 0, 0, 0] , aio writes: 0 [0, 0, 0, 0] ,
ibuf aio reads: 0, log i/o's: 0, sync i/o's: 0
Pending flushes (fsync) log: 0; buffer pool: 0
353973402 OS file reads, 10709125289 OS file writes, 5601652403 OS fsyncs
0.04 reads/s, 16384 avg bytes/read, 0.71 writes/s, 0.71 fsyncs/s
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 1, free list len 13478, seg size 13480, 46216789 merges
merged operations:
insert 78439264, delete mark 42223185, delete 9152304
discarded operations:
insert 0, delete mark 0, delete 0
AHI PARTITION 1: Hash table size 3187567, node heap has 8016 buffer(s)
AHI PARTITION 2: Hash table size 3187567, node heap has 963 buffer(s)
5858.75 hash searches/s, 346.57 non-hash searches/s
---
LOG
---
Log sequence number 5457632042463
Log flushed up to 5457632042463
Pages flushed up to 5457612529810
Last checkpoint at 5457612529810
0 pending log flushes, 0 pending chkp writes
2110473201 log i/o's done, 0.71 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total memory allocated 13218349056; in additional pool allocated 0
Dictionary memory allocated 7727246
Buffer pool size 786432
Free buffers 8192
Database pages 749700
Old database pages 276582
Modified db pages 9266
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages made young 684364474, not young 13398017116
0.21 youngs/s, 0.06 non-youngs/s
Pages read 457565483, created 140335235, written 10750192913
0.04 reads/s, 0.02 creates/s, 0.00 writes/s
Buffer pool hit rate 1000 / 1000, young-making rate 0 / 1000 not 0 / 1000
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 749700, unzip_LRU len: 0
I/O sum[16]:cur[0], unzip sum[0]:cur[0]
----------------------
INDIVIDUAL BUFFER POOL INFO
----------------------
---BUFFER POOL 0
Buffer pool size 98304
Free buffers 1024
Database pages 93708
Old database pages 34571
Modified db pages 1343
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages made young 85156668, not young 1683490473
0.04 youngs/s, 0.00 non-youngs/s
Pages read 56859724, created 17529002, written 1631421964
0.00 reads/s, 0.02 creates/s, 0.00 writes/s
Buffer pool hit rate 1000 / 1000, young-making rate 0 / 1000 not 0 / 1000
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 93708, unzip_LRU len: 0
I/O sum[2]:cur[0], unzip sum[0]:cur[0]
---BUFFER POOL 1
Buffer pool size 98304
Free buffers 1024
Database pages 93661
Old database pages 34554
Modified db pages 937
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages made young 85205775, not young 1689025718
0.04 youngs/s, 0.00 non-youngs/s
Pages read 57836940, created 17536735, written 1131248341
0.00 reads/s, 0.00 creates/s, 0.00 writes/s
Buffer pool hit rate 1000 / 1000, young-making rate 0 / 1000 not 0 / 1000
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 93661, unzip_LRU len: 0
I/O sum[2]:cur[0], unzip sum[0]:cur[0]
--------------
ROW OPERATIONS
--------------
0 queries inside InnoDB, 0 queries in queue
99 read views open inside InnoDB
Main thread process no. 33567, id 47947005839104, state: sleeping
Number of rows inserted 3618841249, updated 4181423572, deleted 533785971, read 13318808006726
0.29 inserts/s, 0.75 updates/s, 0.00 deletes/s, 6144.00 reads/s
----------------------------
END OF INNODB MONITOR OUTPUT
============================
- 通过分析status状态日志可以清晰的分析出两个事务之间 互相持有锁以及等待获取锁 而产生了死锁的情况。
问题的原因是两种线上业务操作在某个时段大量同时执行,使得事务T1和事务T2在此时段同时执行,两个事务都是操作两张表,但操作表的顺序不一样,事务T1 是table1-table2,事务T2 是 table2-table1,所以当相关业务的两个事务大量同时并行执行时,如果恰巧又操作同一条数据,则出现死锁等待的情况,当事务超时,系统自动回滚其中一个事务,放行另一事务。 - 若干不及时解决,线上出现大量死锁,则客户端的现象就是系统卡顿,无法满足业务需求。
- 解决方法:把两个事务执行表的顺序改成统一顺序,即可解决。
- 实际线上业务出现死锁的情况有很多,此次出现的问题属于易于发现和解决的情况,很多情况下会比这复杂的多,需要根据实际情况分析解决。
可以参考:
死锁-产生原因和解决方法