者:赵黎明

爱可生 MySQL DBA 团队成员,Oracle 10g OCM,MySQL 5.7 OCP,擅长数据库性能问题诊断、事务与锁问题的分析等,负责处理客户 MySQL 及我司自研 DMP 平台日常运维中的问题,对开源数据库相关技术非常感兴趣。
本文来源:原创投稿 *爱可生开源社区出品,原创内容未经授权不得随意使用,转载请联系小编并注明来源。

什么是半一致性读?

先看下官方的描述:

  • 是一种用在 Update 语句中的读操作(一致性读)的优化,是在 RC 事务隔离级别下与一致性读的结合。
  • 当 Update 语句的 where 条件中匹配到的记录已经上锁,会再次去 InnoDB 引擎层读取对应的行记录,判断是否真的需要上锁(第一次需要由 InnoDB 先返回一个最新的已提交版本)。
  • 只在 RC 事务隔离级别下或者是设置了 innodb_locks_unsafe_for_binlog=1 的情况下才会发生。
  • innodb_locks_unsafe_for_binlog 参数在 8.0 版本中已被去除(可见,这是一个可能会导致数据不一致的参数,官方也不建议使用了)。

测试案例

InnoDB 引擎的强大之处就在于它能完美地支持事务,而事务的一致性则是由事务隔离级别和并发事务锁来保证的。接下来,我们先通过 2 个测试案例来观察半一致性读会对事务产生哪些影响。

案例 1

  • RC 隔离级别,3 个 Session 执行事务语句

-- 创建测试表root@localhost:mysqld.sock[zlm] <5.7.30-log>create table zlm.t(id int,sal int) engine innodb default character set utf8mb4;Query OK, 0 rows affected (0.06 sec)
root@localhost:mysqld.sock[zlm] <5.7.30-log>show create table zlm.t\G*************************** 1. row *************************** Table: tCreate Table: CREATE TABLE `t` ( `id` int(11) DEFAULT NULL, `sal` int(11) DEFAULT NULL) ENGINE=InnoDB DEFAULT CHARSET=utf8mb41 row in set (0.01 sec)
-- 写入测试数据(创建插入数据的存储过程)root@localhost:mysqld.sock[zlm] <5.7.30-log>drop procedure if exists zlm.proc_t;Query OK, 0 rows affected (0.00 sec)
root@localhost:mysqld.sock[zlm] <5.7.30-log>delimiter $$root@localhost:mysqld.sock[zlm] <5.7.30-log>create procedure zlm.proc_t() -> begin -> declare i int default 1; -> declare j int default 100; -> while i<11 do -> insert into t(id,sal) values(i,j); -> set i=i+1; -> set j=j+100; -> end while; -> end $$Query OK, 0 rows affected (0.01 sec)
root@localhost:mysqld.sock[zlm] <5.7.30-log>delimiter ;root@localhost:mysqld.sock[zlm] <5.7.30-log>select * from t;+------+------+| id | sal |+------+------+| 1 | 100 || 2 | 200 || 3 | 300 || 4 | 400 || 5 | 500 || 6 | 600 || 7 | 700 || 8 | 800 || 9 | 900 || 10 | 1000 |+------+------+10 rows in set (0.00 sec)
-- 开启RC隔离级别root@localhost:mysqld.sock[zlm] <5.7.30-log>set @@global.tx_isolation='read-committed';Query OK, 0 rows affected, 1 warning (0.00 sec)
注意,从 8.0.3 版本开始,去掉了 tx_isolation 参数,参数名只支持 transaction_isolation
-- 开启两个新的Session(设置global参数后,仅对新连接生效)-- 确认Session 1的隔离级别和线程IDroot@localhost:mysqld.sock[(none)] <5.7.30-log>show variables like 'tx_isolation';+---------------+----------------+| Variable_name | Value |+---------------+----------------+| tx_isolation | READ-COMMITTED |+---------------+----------------+1 row in set (0.01 sec)
root@localhost:mysqld.sock[(none)] <5.7.30-log>select connection_id();+-----------------+| connection_id() |+-----------------+| 8 |+-----------------+1 row in set (0.00 sec)
-- Session 1执行当前读的Select语句root@localhost:mysqld.sock[zlm] <5.7.30-log>begin;select * from t where id>3 and id<6 for update;Query OK, 0 rows affected (0.00 sec)
+------+------+| id | sal |+------+------+| 4 | 400 || 5 | 500 |+------+------+2 rows in set (0.00 sec)
-- 查看加锁详情(需设置参数innodb_status_output_locks=on,否则看不到IX锁)-- 线程8的事务1314,获取到了1个表级插入意向锁IX,2个记录锁,对应id=4,id=5的这两条记录---TRANSACTION 1313, ACTIVE 7 sec2 lock struct(s), heap size 1136, 2 row lock(s)MySQL thread id 8, OS thread handle 140086065960704, query id 277 localhost rootTABLE LOCK table `zlm`.`t` trx id 1313 lock mode IXRECORD LOCKS space id 23 page no 3 n bits 80 index GEN_CLUST_INDEX of table `zlm`.`t` trx id 1313 lock_mode X locks rec but not gapRecord lock, heap no 5 PHYSICAL RECORD: n_fields 5; compact format; info bits 0 0: len 6; hex 000000000204; asc ;; 1: len 6; hex 000000000515; asc ;; 2: len 7; hex b0000001240110; asc $ ;; 3: len 4; hex 80000004; asc ;; 4: len 4; hex 80000190; asc ;;
Record lock, heap no 6 PHYSICAL RECORD: n_fields 5; compact format; info bits 0 0: len 6; hex 000000000205; asc ;; 1: len 6; hex 000000000516; asc ;; 2: len 7; hex b1000001250110; asc % ;; 3: len 4; hex 80000005; asc ;; 4: len 4; hex 800001f4; asc ;;
-- 确认Session 2的隔离级别和线程IDroot@localhost:mysqld.sock[(none)] <5.7.30-log>show variables like 'tx_isolation';+---------------+----------------+| Variable_name | Value |+---------------+----------------+| tx_isolation | READ-COMMITTED |+---------------+----------------+1 row in set (0.00 sec)
root@localhost:mysqld.sock[(none)] <5.7.30-log>select connection_id();+-----------------+| connection_id() |+-----------------+| 9 |+-----------------+1 row in set (0.00 sec)
-- Session 2执行当前读的Select语句root@localhost:mysqld.sock[zlm] <5.7.30-log>begin;select * from t where id = 7 for update;Query OK, 0 rows affected (0.00 sec)
ERROR 1205 (HY000): Unknown error 1205 ## 1205表示锁等待超时,这里吐槽下最新GA的5.7.30,遇到错误仅会抛出一个代码,没有错误描述,不方便排查
-- 查看加锁详情-- innodb锁等待超时前,可以看到线程91314事务正在请求并等待1个记录锁,id=4的这条记录---TRANSACTION 1314, ACTIVE 3 sec fetching rowsmysql tables in use 1, locked 1LOCK WAIT 3 lock struct(s), heap size 1136, 1 row lock(s)MySQL thread id 9, OS thread handle 140086065690368, query id 282 localhost root Sending dataselect * from t where id = 7 for update------- TRX HAS BEEN WAITING 3 SEC FOR THIS LOCK TO BE GRANTED:RECORD LOCKS space id 23 page no 3 n bits 80 index GEN_CLUST_INDEX of table `zlm`.`t` trx id 1314 lock_mode X locks rec but not gap waitingRecord lock, heap no 5 PHYSICAL RECORD: n_fields 5; compact format; info bits 0 0: len 6; hex 000000000204; asc ;; 1: len 6; hex 000000000515; asc ;; 2: len 7; hex b0000001240110; asc $ ;; 3: len 4; hex 80000004; asc ;; 4: len 4; hex 80000190; asc ;;
------------------TABLE LOCK table `zlm`.`t` trx id 1314 lock mode IXRECORD LOCKS space id 23 page no 3 n bits 80 index GEN_CLUST_INDEX of table `zlm`.`t` trx id 1314 lock_mode X locks rec but not gapRECORD LOCKS space id 23 page no 3 n bits 80 index GEN_CLUST_INDEX of table `zlm`.`t` trx id 1314 lock_mode X locks rec but not gap waitingRecord lock, heap no 5 PHYSICAL RECORD: n_fields 5; compact format; info bits 0 0: len 6; hex 000000000204; asc ;; 1: len 6; hex 000000000515; asc ;; 2: len 7; hex b0000001240110; asc $ ;; 3: len 4; hex 80000004; asc ;; 4: len 4; hex 80000190; asc ;;
-- innodb锁等待超时后再观察一次,线程9的事务1314的事务仍然没有结束,对t表持有IX锁,并且仍然在等待id=4的行锁释放---TRANSACTION 1314, ACTIVE 453 sec2 lock struct(s), heap size 1136, 1 row lock(s)MySQL thread id 9, OS thread handle 140086065690368, query id 282 localhost rootTABLE LOCK table `zlm`.`t` trx id 1314 lock mode IXRECORD LOCKS space id 23 page no 3 n bits 80 index GEN_CLUST_INDEX of table `zlm`.`t` trx id 1314 lock_mode X locks rec but not gap
-- 确认Session 3的隔离级别和线程ID+---------------+----------------+| Variable_name | Value |+---------------+----------------+| tx_isolation | READ-COMMITTED |+---------------+----------------+1 row in set (0.01 sec)
root@localhost:mysqld.sock[(none)] <5.7.30-log>select connection_id();+-----------------+| connection_id() |+-----------------+| 10 |+-----------------+1 row in set (0.00 sec)
-- Session 3执行Update语句root@localhost:mysqld.sock[zlm] <5.7.30-log>begin;update t set sal = sal + 1 where id = 7;Query OK, 0 rows affected (0.00 sec)
Query OK, 1 row affected (0.00 sec)

在 Session 1 事务仍然未结束的情况下,Session 3 的事务未被阻塞,可以正常执行。

-- 查看3个语句的执行计划root@localhost:mysqld.sock[zlm] <5.7.30-log>explain select * from t where id>3 and id<6 for update;+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+| 1 | SIMPLE | t | NULL | ALL | NULL | NULL | NULL | NULL | 10 | 11.11 | Using where |+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+1 row in set, 1 warning (0.00 sec)
root@localhost:mysqld.sock[zlm] <5.7.30-log>explain select * from t where id = 7 for update;+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+| 1 | SIMPLE | t | NULL | ALL | NULL | NULL | NULL | NULL | 10 | 10.00 | Using where |+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+1 row in set, 1 warning (0.00 sec)
root@localhost:mysqld.sock[zlm] <5.7.30-log>explain update t set sal=sal+1 where id=7;+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+| 1 | UPDATE | t | NULL | ALL | NULL | NULL | NULL | NULL | 10 | 100.00 | Using where |+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+1 row in set (0.00 sec)
意料之中,由于 t 表没有索引,执行计划必然是走全表扫描,也就是每条被读取到的记录,都会上行锁。那为何 Session 1 只锁了id=4,id=5 的这两条,并没有锁全表呢?而同样是请求 id=7 的记录,为何 Session 2 无法获取锁资源,Session 3 却能成功执行?也许大家从上面的锁分析可以很快得到结论,由于 Session 1 只占用了 id=4、id=5 的行锁,那么 Session 3 去请求 id=7 的自然不会有冲突(似乎挺有道理) 那么 Session 2 对 id=7 的请求,为何会被锁定呢?
带着这些疑问,我们继续看第 2 个案例。

案例 2

  • RC 隔离级别,这次 Session 1 执行的 Select 语句不带 where 条件

-- Session 1 执行当前读的 Select 语句root@localhost:mysqld.sock[zlm] <5.7.30-log>begin;select * from t for update;Query OK, 0 rows affected (0.00 sec)
+------+------+| id | sal |+------+------+| 1 | 100 || 2 | 200 || 3 | 300 || 4 | 400 || 5 | 500 || 6 | 600 || 7 | 700 || 8 | 800 || 9 | 900 || 10 | 1000 |+------+------+10 rows in set (0.00 sec)
-- 查看加锁详情-- 线程8的1317事务获得了1个IX表锁和10个X记录锁,即:把表中的10条记录都锁定了-- t表上没有索引,MySQL默认会创建GEN_CLUST_INDEX的聚簇索引,而语句没有加where条件,只能走全表扫描,每条被读取的记录,都要在聚簇索引上加上记录锁(全表记录锁,相当于一个表锁了)---TRANSACTION 1317, ACTIVE 5 sec2 lock struct(s), heap size 1136, 10 row lock(s)MySQL thread id 8, OS thread handle 140086065960704, query id 312 localhost rootTABLE LOCK table `zlm`.`t` trx id 1317 lock mode IXRECORD LOCKS space id 23 page no 3 n bits 80 index GEN_CLUST_INDEX of table `zlm`.`t` trx id 1317 lock_mode X locks rec but not gapRecord lock, heap no 2 PHYSICAL RECORD: n_fields 5; compact format; info bits 0 0: len 6; hex 000000000201; asc ;; 1: len 6; hex 00000000050e; asc ;; 2: len 7; hex ab0000011f0110; asc ;; 3: len 4; hex 80000001; asc ;; 4: len 4; hex 80000064; asc d;;
Record lock, heap no 3 PHYSICAL RECORD: n_fields 5; compact format; info bits 0 0: len 6; hex 000000000202; asc ;; 1: len 6; hex 00000000050f; asc ;; 2: len 7; hex ac000001200110; asc ;; 3: len 4; hex 80000002; asc ;; 4: len 4; hex 800000c8; asc ;;
Record lock, heap no 4 PHYSICAL RECORD: n_fields 5; compact format; info bits 0 0: len 6; hex 000000000203; asc ;; 1: len 6; hex 000000000514; asc ;; 2: len 7; hex af000001230110; asc # ;; 3: len 4; hex 80000003; asc ;; 4: len 4; hex 8000012c; asc ,;;
Record lock, heap no 5 PHYSICAL RECORD: n_fields 5; compact format; info bits 0 0: len 6; hex 000000000204; asc ;; 1: len 6; hex 000000000515; asc ;; 2: len 7; hex b0000001240110; asc $ ;; 3: len 4; hex 80000004; asc ;; 4: len 4; hex 80000190; asc ;;
Record lock, heap no 6 PHYSICAL RECORD: n_fields 5; compact format; info bits 0 0: len 6; hex 000000000205; asc ;; 1: len 6; hex 000000000516; asc ;; 2: len 7; hex b1000001250110; asc % ;; 3: len 4; hex 80000005; asc ;; 4: len 4; hex 800001f4; asc ;;
Record lock, heap no 7 PHYSICAL RECORD: n_fields 5; compact format; info bits 0 0: len 6; hex 000000000206; asc ;; 1: len 6; hex 000000000517; asc ;; 2: len 7; hex b2000001260110; asc & ;; 3: len 4; hex 80000006; asc ;; 4: len 4; hex 80000258; asc X;;
Record lock, heap no 8 PHYSICAL RECORD: n_fields 5; compact format; info bits 0 0: len 6; hex 000000000207; asc ;; 1: len 6; hex 000000000518; asc ;; 2: len 7; hex b3000001270110; asc ' ;; 3: len 4; hex 80000007; asc ;; 4: len 4; hex 800002bc; asc ;;
Record lock, heap no 9 PHYSICAL RECORD: n_fields 5; compact format; info bits 0 0: len 6; hex 000000000208; asc ;; 1: len 6; hex 000000000519; asc ;; 2: len 7; hex b4000001280110; asc ( ;; 3: len 4; hex 80000008; asc ;; 4: len 4; hex 80000320; asc ;;
Record lock, heap no 10 PHYSICAL RECORD: n_fields 5; compact format; info bits 0 0: len 6; hex 000000000209; asc ;; 1: len 6; hex 00000000051a; asc ;; 2: len 7; hex b5000001290110; asc ) ;; 3: len 4; hex 80000009; asc ;; 4: len 4; hex 80000384; asc ;;
Record lock, heap no 11 PHYSICAL RECORD: n_fields 5; compact format; info bits 0 0: len 6; hex 00000000020a; asc ;; 1: len 6; hex 00000000051b; asc ;; 2: len 7; hex b60000012a0110; asc * ;; 3: len 4; hex 8000000a; asc ;; 4: len 4; hex 800003e8; asc ;;
-- Session 2执行当前读的Select语句root@localhost:mysqld.sock[zlm] <5.7.30-log>begin;select * from t where id = 7 for update;Query OK, 0 rows affected (0.00 sec)
ERROR 1205 (HY000): Unknown error 1205
与之前案例 1 相同,也是锁等待超时退出。
-- 查看加锁详情-- 这次线程9的事务1318从第1条记录就开始加锁了---TRANSACTION 1318, ACTIVE 4 sec starting index readmysql tables in use 1, locked 1LOCK WAIT 2 lock struct(s), heap size 1136, 1 row lock(s)MySQL thread id 9, OS thread handle 140086065690368, query id 315 localhost root Sending dataselect * from t where id = 7 for update------- TRX HAS BEEN WAITING 4 SEC FOR THIS LOCK TO BE GRANTED:RECORD LOCKS space id 23 page no 3 n bits 80 index GEN_CLUST_INDEX of table `zlm`.`t` trx id 1318 lock_mode X locks rec but not gap waitingRecord lock, heap no 2 PHYSICAL RECORD: n_fields 5; compact format; info bits 0 0: len 6; hex 000000000201; asc ;; 1: len 6; hex 00000000050e; asc ;; 2: len 7; hex ab0000011f0110; asc ;; 3: len 4; hex 80000001; asc ;; 4: len 4; hex 80000064; asc d;;
------------------TABLE LOCK table `zlm`.`t` trx id 1318 lock mode IXRECORD LOCKS space id 23 page no 3 n bits 80 index GEN_CLUST_INDEX of table `zlm`.`t` trx id 1318 lock_mode X locks rec but not gap waitingRecord lock, heap no 2 PHYSICAL RECORD: n_fields 5; compact format; info bits 0 0: len 6; hex 000000000201; asc ;; 1: len 6; hex 00000000050e; asc ;; 2: len 7; hex ab0000011f0110; asc ;; 3: len 4; hex 80000001; asc ;; 4: len 4; hex 80000064; asc d;;
-- Session 3执行Update语句root@localhost:mysqld.sock[zlm] <5.7.30-log>begin;update t set sal = sal + 1 where id = 7;Query OK, 0 rows affected (0.00 sec)
ERROR 1205 (HY000): Unknown error 1205
与案例 1 不同的是,这次 Update 语句也遭遇锁等待超时退出了。
-- 查看加锁详情-- 这次Session 3请求id=7的记录锁,该锁被Session 1持有未释放,导致了Session 3锁等待超时---TRANSACTION 1319, ACTIVE 14 sec fetching rowsmysql tables in use 1, locked 1LOCK WAIT 2 lock struct(s), heap size 1136, 8 row lock(s)MySQL thread id 10, OS thread handle 140086066231040, query id 322 localhost root updatingupdate t set sal = sal + 1 where id = 7------- TRX HAS BEEN WAITING 14 SEC FOR THIS LOCK TO BE GRANTED:RECORD LOCKS space id 23 page no 3 n bits 80 index GEN_CLUST_INDEX of table `zlm`.`t` trx id 1319 lock_mode X locks rec but not gap waitingRecord lock, heap no 8 PHYSICAL RECORD: n_fields 5; compact format; info bits 0 0: len 6; hex 000000000207; asc ;; 1: len 6; hex 000000000518; asc ;; 2: len 7; hex b3000001270110; asc ' ;; 3: len 4; hex 80000007; asc ;; 4: len 4; hex 800002bc; asc ;;
------------------TABLE LOCK table `zlm`.`t` trx id 1319 lock mode IXRECORD LOCKS space id 23 page no 3 n bits 80 index GEN_CLUST_INDEX of table `zlm`.`t` trx id 1319 lock_mode X locks rec but not gap waitingRecord lock, heap no 8 PHYSICAL RECORD: n_fields 5; compact format; info bits 0 0: len 6; hex 000000000207; asc ;; 1: len 6; hex 000000000518; asc ;; 2: len 7; hex b3000001270110; asc ' ;; 3: len 4; hex 80000007; asc ;; 4: len 4; hex 800002bc; asc ;;

案例分析

由于 t 表上不存在索引,3 个会话执行的语句都是全表扫描,在 RC 事务隔离级别下,这些语句都是需要发起当前读的操作(读取t表上最新的已提交事务版本),需要对读取到的全部记录加上记录锁(即行锁、也可称为 InnoDB 锁,大多数情况下,RC 隔离级别没有 Gap 锁,因此基本不太会出现 Next-Key 锁,对高并发场景比较友好)。

案例 1

  • Session 1:开始需要对每条记录加锁,由于不需要维护可重复读,也不需要锁 Gap,当返回 MySQL Server 层通过 where 条件过滤后,最终只对 id=4、id=5 的记录加了锁。

  • Session 2:从 id=1 开始读取记录并加锁,当读取到 id=4 的记录时,由于 Session 1 先对 id=4 的记录上了锁,就无法再对其进行加锁操作,我们看到它一直在等待 id=4 的 X 锁,直到锁等待超时报错,为何是 id=4,而不是 id=5?因为是按聚簇索引一条条读取记录的,所以锁也需要一条条加,当上一条记录的锁资源没获取到,就不会对下一条记录加锁。

  • Session 3:同样地,最开始也需要对读取到的记录一条条加锁,由于 id=7 的记录与 id=4、id=5 上的行锁并不冲突,此处可以利用半一致性读对 Update 的优化特性,提前将 id=7 上的行锁释放掉了,因此 Update 不会被阻塞,事务得以正常执行。

案例 2

  • Session 1:Select 语句没有用 where 条件,通过全表扫描访问到的所有记录都无法通过 MySQL Server 层过滤,因此将 t 表的全部记录都上了 X 锁。

  • Session 2:由于 Session 1 已经将全部记录都上了 X 锁,Session 2 当前读的 Select 操作由于无法获取任何记录的 X 锁,就被阻塞了。

  • Session 3:同样地,Session 1 持有的全记录 X 锁,使 Session 3 的 where 条件落到了匹配的区间内,表示 Session 1 对 id=7 的行确实需要更新,必须上锁,因此 Session 3 的 Update 被阻塞。

总结

在 RC 事务隔离级别下,Update 语句可以利用到半一致性读的特性,会多进行一次判断,当 where 条件匹配到的记录与当前持有锁的事务中的记录不冲突时,就会提前释放 InnoDB 锁,虽然这样做违背了二阶段加锁协议,但却可以减少锁冲突,提高事务并发能力,是一种很好的优化行为。

参考链接
https://dev.mysql.com/doc/refman/5.7/en/innodb-consistent-read.html
https://dev.mysql.com/doc/refman/5.7/en/innodb-locks-set.html
https://my.oschina.net/JKOPERA/blog/1929335