|
背景
业务端遇到报错为"Deadlock found when trying to get lock; try restarting transaction"则表明有死锁发生
名称配置数据库版本GreatSQL 8.0.26隔离级别Read-Commitedinnodb status 日志
- greatsql> show engine innodb status\G
- *************************** 1. row ***************************
- Type: InnoDB
- Name:
- Status:
- =====================================
- 2024-01-28 16:55:38 140737023727360 INNODB MONITOR OUTPUT
- =====================================
- Per second averages calculated from the last 14 seconds
- -----------------
- BACKGROUND THREAD
- -----------------
- srv_master_thread loops: 41 srv_active, 0 srv_shutdown, 17830 srv_idle
- srv_master_thread log flush and writes: 0
- ----------
- SEMAPHORES
- ----------
- -------------
- RW-LATCH INFO
- -------------
- Total number of rw-locks 132361
- OS WAIT ARRAY INFO: reservation count 11180
- OS WAIT ARRAY INFO: signal count 11177
- RW-shared spins 0, rounds 0, OS waits 0
- RW-excl spins 0, rounds 0, OS waits 0
- RW-sx spins 0, rounds 0, OS waits 0
- Spin rounds per wait: 0.00 RW-shared, 0.00 RW-excl, 0.00 RW-sx
- ------------------------
- LATEST DETECTED DEADLOCK
- ------------------------
- 2024-01-28 16:53:40 140735053358848
- *** (1) TRANSACTION:
- TRANSACTION 37616, ACTIVE 8 sec inserting
- mysql tables in use 1, locked 1
- LOCK WAIT 2 lock struct(s), heap size 1192, 1 row lock(s), undo log entries 1
- MySQL thread id 16, OS thread handle 140737023432448, query id 652 127.0.0.1 root update
- insert into info values (50,11)
- *** (1) HOLDS THE LOCK(S):
- RECORD LOCKS space id 26 page no 5 n bits 80 index uk_name of table `apple`.`info` trx id 37616 lock mode S waiting
- Record lock, heap no 7 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
- 0: len 4; hex 8000000b; asc ;;
- 1: len 4; hex 80000028; asc (;;
- *** (1) WAITING FOR THIS LOCK TO BE GRANTED:
- RECORD LOCKS space id 26 page no 5 n bits 80 index uk_name of table `apple`.`info` trx id 37616 lock mode S waiting
- Record lock, heap no 7 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
- 0: len 4; hex 8000000b; asc ;;
- 1: len 4; hex 80000028; asc (;;
- *** (2) TRANSACTION:
- TRANSACTION 37615, ACTIVE 24 sec inserting
- mysql tables in use 1, locked 1
- LOCK WAIT 3 lock struct(s), heap size 1192, 2 row lock(s), undo log entries 2
- MySQL thread id 15, OS thread handle 140737024022272, query id 653 127.0.0.1 root update
- insert into info values (60,8)
- *** (2) HOLDS THE LOCK(S):
- RECORD LOCKS space id 26 page no 5 n bits 80 index uk_name of table `apple`.`info` trx id 37615 lock_mode X locks rec but not gap
- Record lock, heap no 7 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
- 0: len 4; hex 8000000b; asc ;;
- 1: len 4; hex 80000028; asc (;;
- *** (2) WAITING FOR THIS LOCK TO BE GRANTED:
- RECORD LOCKS space id 26 page no 5 n bits 80 index uk_name of table `apple`.`info` trx id 37615 lock_mode X locks gap before rec insert intention waiting
- Record lock, heap no 7 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
- 0: len 4; hex 8000000b; asc ;;
- 1: len 4; hex 80000028; asc (;;
- *** WE ROLL BACK TRANSACTION (1)
- ------------
- TRANSACTIONS
- ------------
复制代码 查看表结构
- greatsql> show create table info \G
- *************************** 1. row ***************************
- Table: info
- Create Table: CREATE TABLE `info` (
- `id` int NOT NULL AUTO_INCREMENT,
- `name` int NOT NULL,
- PRIMARY KEY (`id`),
- UNIQUE KEY `uk_name` (`name`)
- ) ENGINE=InnoDB AUTO_INCREMENT=61 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
- 1 row in set (0.01 sec)
复制代码 梳理 innodb status 日志
事务T1T2操作insert into info values (50,11)insert into info values (60,8)关联的对象表apple.info的唯一索引 uk_name表apple.info的唯一索引 uk_name持有的锁lock mode S waitingheap no 7 11,40(十六进制为8,28)lock_mode X locks rec but not gapheap no 7 11,40(十六进制为8,28)等待的锁lock mode S waitingheap no 7 11,40(十六进制为8,28)lock_mode X locks gap before rec insert intention waitingheap no 7 11,40(十六进制为8,28)
- 首先事务T2获取到了uk_name中记录11的 lock x,rec not not gap 锁
- 事务T1尝试获取uk_name中记录11的lock s, next key lock,由于T2持有了记录的独占锁,因此被T1堵塞
- 事务T2尝试获取uk_name中记录11的lock x, gap before rec,insert intention,但被堵塞
获取业务历史SQL语句
通过系统表方式
通过performance_schema.threads、performance_schema.events_statements_history、performance_schema.events_statements_history_long等系统表获取历史SQL
- 根据GreatSQL thread id获得线程id
- greatsql> select PROCESSLIST_ID,THREAD_ID,THREAD_OS_ID from performance_schema.threads where processlist_id in (15,16);
- +----------------+-----------+--------------+
- | PROCESSLIST_ID | THREAD_ID | THREAD_OS_ID |
- +----------------+-----------+--------------+
- | 15 | 61 | 5714 |
- | 16 | 62 | 5719 |
- +----------------+-----------+--------------+
- 2 rows in set (0.00 sec)
复制代码- greatsql> select THREAD_ID,EVENT_ID,CURRENT_SCHEMA,SQL_TEXT,MESSAGE_TEXT,EVENT_NAME,SOURCE from performance_schema.events_statements_history where thread_id in (61,62) order by THREAD_ID,EVENT_ID;
- +-----------+----------+----------------+---------------------------------+--------------------------------------------------------------------+--------------------------+---------------------------------+
- | THREAD_ID | EVENT_ID | CURRENT_SCHEMA | SQL_TEXT | MESSAGE_TEXT | EVENT_NAME | SOURCE |
- +-----------+----------+----------------+---------------------------------+--------------------------------------------------------------------+--------------------------+---------------------------------+
- | 61 | 3762 | apple | NULL | NULL | statement/com/Field List | init_net_server_extension.cc:94 |
- | 61 | 3807 | apple | NULL | NULL | statement/com/Field List | init_net_server_extension.cc:94 |
- | 61 | 3852 | apple | NULL | NULL | statement/com/Field List | init_net_server_extension.cc:94 |
- | 61 | 3897 | apple | NULL | NULL | statement/com/Field List | init_net_server_extension.cc:94 |
- | 61 | 3942 | apple | NULL | NULL | statement/com/Field List | init_net_server_extension.cc:94 |
- | 61 | 3987 | apple | NULL | NULL | statement/com/Field List | init_net_server_extension.cc:94 |
- | 61 | 4032 | apple | NULL | NULL | statement/com/Field List | init_net_server_extension.cc:94 |
- | 61 | 4077 | apple | begin | NULL | statement/sql/begin | init_net_server_extension.cc:94 |
- | 61 | 4100 | apple | insert into info values (40,11) | NULL | statement/sql/insert | init_net_server_extension.cc:94 |
- | 61 | 4569 | apple | insert into info values (60,8) | NULL | statement/sql/insert | init_net_server_extension.cc:94 |
- | 62 | 3215 | apple | NULL | NULL | statement/com/Field List | init_net_server_extension.cc:94 |
- | 62 | 3260 | apple | NULL | NULL | statement/com/Field List | init_net_server_extension.cc:94 |
- | 62 | 3305 | apple | NULL | NULL | statement/com/Field List | init_net_server_extension.cc:94 |
- | 62 | 3350 | apple | NULL | NULL | statement/com/Field List | init_net_server_extension.cc:94 |
- | 62 | 3395 | apple | NULL | NULL | statement/com/Field List | init_net_server_extension.cc:94 |
- | 62 | 3440 | apple | NULL | NULL | statement/com/Field List | init_net_server_extension.cc:94 |
- | 62 | 3485 | apple | NULL | NULL | statement/com/Field List | init_net_server_extension.cc:94 |
- | 62 | 3530 | apple | NULL | NULL | statement/com/Field List | init_net_server_extension.cc:94 |
- | 62 | 3575 | apple | begin | NULL | statement/sql/begin | init_net_server_extension.cc:94 |
- | 62 | 3598 | apple | insert into info values (50,11) | Deadlock found when trying to get lock; try restarting transaction | statement/sql/insert | init_net_server_extension.cc:94 |
- +-----------+----------+----------------+---------------------------------+--------------------------------------------------------------------+--------------------------+---------------------------------+
- 20 rows in set (0.00 sec)
复制代码
- 观察show engine innodb status中的GreatSQL thread id 16和GreatSQL thread id 15
- 通过performance_schema.threads获取THREAD_ID
- 通过performance_schema.events_statements_history获取THREAD_ID执行的历史SQL以及执行时间
最终可复现出如下业务SQL:
事务T1T2语句begin;begin;语句insert into info values (40,11);语句insert into info values (50,11);语句insert into info values (60,8);通过解析binlog
$ mysqlbinlog -vv --base64-output=decode-rows bin.000030
- SET @@SESSION.GTID_NEXT= 'e319a624-b2ce-11ee-9aac-00163e62ca8a:8696'/*!*/;
- # at 10314
- #240128 16:52:35 server id 1024 end_log_pos 10390 CRC32 0x59edb313 Query thread_id=18 exec_time=0 error_code=0
- SET TIMESTAMP=1706431955/*!*/;
- BEGIN
- /*!*/;
- # at 10390
- #240128 16:52:35 server id 1024 end_log_pos 10442 CRC32 0xc03dea61 Table_map: `apple`.`info` mapped to number 370
- # at 10442
- #240128 16:52:35 server id 1024 end_log_pos 10486 CRC32 0x670e0c66 Write_rows: table id 370 flags: STMT_END_F
- ### INSERT INTO `apple`.`info`
- ### SET
- ### @1=30 /* INT meta=0 nullable=0 is_null=0 */
- ### @2=30 /* INT meta=0 nullable=0 is_null=0 */
- # at 10486
- #240128 16:52:35 server id 1024 end_log_pos 10517 CRC32 0xab4e0d89 Xid = 598
- COMMIT/*!*/;
- # at 10517
- #240128 19:22:12 server id 1024 end_log_pos 10596 CRC32 0x4f4cf08e GTID last_committed=30 sequence_number=36 rbr_only=yes original_committed_timestamp=1706440932450590 immediate_commit_timestamp=1706440932450590 transaction_length=378
- /*!50718 SET TRANSACTION ISOLATION LEVEL READ COMMITTED*//*!*/;
- # original_commit_timestamp=1706440932450590 (2024-01-28 19:22:12.450590 CST)
- # immediate_commit_timestamp=1706440932450590 (2024-01-28 19:22:12.450590 CST)
- /*!80001 SET @@session.original_commit_timestamp=1706440932450590*//*!*/;
- /*!80014 SET @@session.original_server_version=80026*//*!*/;
- /*!80014 SET @@session.immediate_server_version=80026*//*!*/;
- SET @@SESSION.GTID_NEXT= 'e319a624-b2ce-11ee-9aac-00163e62ca8a:8697'/*!*/;
- # at 10596
- #240128 16:53:16 server id 1024 end_log_pos 10672 CRC32 0xf222c003 Query thread_id=15 exec_time=0 error_code=0
- SET TIMESTAMP=1706431996/*!*/;
- BEGIN
- /*!*/;
- # at 10672
- #240128 16:53:16 server id 1024 end_log_pos 10724 CRC32 0x20cb8c86 Table_map: `apple`.`info` mapped to number 370
- # at 10724
- #240128 16:53:16 server id 1024 end_log_pos 10768 CRC32 0xd8f53958 Write_rows: table id 370 flags: STMT_END_F
- ### INSERT INTO `apple`.`info`
- ### SET
- ### @1=40 /* INT meta=0 nullable=0 is_null=0 */
- ### @2=11 /* INT meta=0 nullable=0 is_null=0 */
- # at 10768
- #240128 16:53:40 server id 1024 end_log_pos 10820 CRC32 0x23f22580 Table_map: `apple`.`info` mapped to number 370
- # at 10820
- #240128 16:53:40 server id 1024 end_log_pos 10864 CRC32 0x182ecdef Write_rows: table id 370 flags: STMT_END_F
- ### INSERT INTO `apple`.`info`
- ### SET
- ### @1=60 /* INT meta=0 nullable=0 is_null=0 */
- ### @2=8 /* INT meta=0 nullable=0 is_null=0 */
- # at 10864
- #240128 19:22:12 server id 1024 end_log_pos 10895 CRC32 0x57fd1d3c Xid = 650
- COMMIT/*!*/;
- SET @@SESSION.GTID_NEXT= 'AUTOMATIC' /* added by mysqlbinlog */ /*!*/;
- DELIMITER ;
- # End of log file
- /*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
- /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;
复制代码 根据binlog中部分SET @@SESSION.GTID_NEXT= 'e319a624-b2ce-11ee-9aac-00163e62ca8a:8697'该GTID的事务信息,可恢复T2,但T1执行的语句由于被回滚了,则不会记录到binlog,可开启general log日志获取排查
事务T1T2语句begin;begin;语句insert into info values (40,11);语句insert into info values (50,11);语句insert into info values (60,8);分析死锁
- T1、T2开启了一个事务
- 随后T2执行了插入(40,11)的insert语句:insert into info values (40,11)
- T1执行了插入(50,11)的insert语句:insert into info values (50,11) 进行唯一性冲突检查,尝试获取LOCK_S
- 然后T1所在的连接会将T2中的隐式锁转换为显示锁,此时T2将获取Lock X, Rec_not_gap。由于T2的Lock X, Rec_not_gap与T1的LOCK S不兼容,因此T1被堵塞
- 随后,T2又执行了(60,8)的insert语句:insert into info values (60,8) 由于其插入的唯一索引值是8,因此不存在主键冲突,直接执行乐观插入操作。执行乐观插入时,需要检查其它事务是否堵塞insert操作。其核心是获取待插入记录的下一个值(这里刚好是10),并获取该记录上的所有锁,与需要添加的锁判断是否存在冲突。
- T1持有了记录11的LOCK_S锁与T2的LOCK_X、LOCK_INSERT_INTENTION不兼容,因此T2被T1堵塞
- 死锁形成。
解决
• 适当的减少Unique索引
• 避免插入重复的值(唯一索引所在列)
Enjoy GreatSQL
来源:https://www.cnblogs.com/greatsql/p/18079727
免责声明:由于采集信息均来自互联网,如果侵犯了您的权益,请联系我们【E-Mail:cb@itdo.tech】 我们会及时删除侵权内容,谢谢合作! |
|