第23期：索引设计（组合索引不适用场景改造）

上篇文章已经详细介绍 MySQL 组合索引的概念以及其适用场景，这篇主要介绍 MySQL 组合索引的不适用场景以及改造方案。

回顾下组合索引的语法和必备条件

【组合索引的语法】（只讨论默认升序排列）

alter table t1 add idx_multi(f1, f2, f3);

【必备条件】列 f1 必须存在于 SQL 语句过滤条件中！

也就是说组合索引的第一个列（最左列）在过滤条件中必须存在，而且最好是等值过滤。看看下面这些 SQL，没有一款适合组合索引。

# SQL 1
select * from t1 where f2 = 1;

# SQL 2
select * from t1 where f3 = 1;

# SQL 3
select * from t1 where f2 = 1 and f3 = 1;

# SQL 4
select * from t1 where f2 = 1 or f3 = 1;

# SQL 5
select * from t1 where f1 = 1 or f2 = 1;

# SQL 6
select * from t1 where f1 = 1 or f3 = 1;

# SQL 7
select * from t1 where f1 = 1 or f2 = 1 or f3 = 1;

那接下来对上面的 SQL 一个一个分析：

SQL 1 过滤条件只有一列 f2，是组合索引 idx_multi 的第二列，由于通过列 f2 找不到索引的入口，没法直接使用索引，针对这样的语句只有在列 f2 上建一个单值索引。

(127.0.0.1:3400)|(ytt)>alter table t1 add key idx_f2(f2);
Query OK, 0 rows affected (0.32 sec)
Records: 0  Duplicates: 0  Warnings: 0

加索引前后的执行计划对比，

加索引 idx_f2 之前，全表扫描，没有任何可用的索引。

(127.0.0.1:3400)|(ytt)>explain select * from t1 where f2 = 1\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t1
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 32194
     filtered: 10.00
        Extra: Using where
1 row in set, 1 warning (0.00 sec)

加了索引 idx_f2 之后，直接通过索引 idx_f2 来过滤记录，

(127.0.0.1:3400)|(ytt)>explain select * from t1 where f2 = 1\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t1
   partitions: NULL
         type: ref
possible_keys: idx_f2
          key: idx_f2
      key_len: 5
          ref: const
         rows: 351
     filtered: 100.00
        Extra: NULL
1 row in set, 1 warning (0.00 sec)

SQL 2 和 SQL 1 一样，可以对列 f3 加一个单值索引。

(127.0.0.1:3400)|(ytt)>alter table t1 add key idx_f3(f3);
Query OK, 0 rows affected (0.35 sec)
Records: 0  Duplicates: 0  Warnings: 0

接下来看 SQL 3，SQL 3 的过滤条件为列 (f2, f3)，并且为等值过滤。也不符合组合索引 idx_multi 的必备特征，列 f1 没有被包含到过滤条件里。不过之前对列 f2 和列 f3 分别建立了单值索引，可以利用 MySQL 的 Index merge 特性来合并这两个索引取交集，查询计划如下：

(127.0.0.1:3400)|(ytt)>explain select * from t1 where f2 = 1 and f3 = 1\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t1
   partitions: NULL
         type: index_merge
possible_keys: idx_f2,idx_f3
          key: idx_f3,idx_f2
      key_len: 5,5
          ref: NULL
         rows: 3
     filtered: 100.00
        Extra: Using intersect(idx_f3,idx_f2); Using where
1 row in set, 1 warning (0.00 sec)

可以看到，MySQL 用了 Index_merge 来过滤数据。不过如果这样的语句出现很频繁的话，建议还是加一个仅包含列 f2,f3 的组合索引。

(127.0.0.1:3400)|(ytt)>alter table t1 add key idx_multi_sub(f2,f3);
Query OK, 0 rows affected (0.50 sec)
Records: 0  Duplicates: 0  Warnings: 0

执行计划如下：

(127.0.0.1:3400)|(ytt)>explain select * from t1 where f2 = 1 and f3 = 1\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t1
   partitions: NULL
         type: ref
possible_keys: idx_f2,idx_f3,idx_multi_sub
          key: idx_multi_sub
      key_len: 10
          ref: const,const
         rows: 2
     filtered: 100.00
        Extra: NULL
1 row in set, 1 warning (0.00 sec)

可以看到，MySQL 选择了索引 idx_multi_sub 来过滤数据，而且 type 类型为 ref 相比 index_merge 来的更优化。

再来看看 SQL 4，SQL 4 的过滤条件也是列 f2,f3，不过两列的关系不是 AND，而是 OR，并且也不包含列 f1，那这样完全不符合组合索引的特征。

基于这样的语句，如果给定的过滤条件 f2, f3 过滤性能还可以的话(之后我会重新开篇讲索引的过滤性能)，完全可以给两列分别建单值索引，MySQL 会对这样的查询用 index_merge 对两个单值索引做一个合并过滤，之前已经建过了，看看执行计划：

(127.0.0.1:3400)|(ytt)>explain select * from t1 where f2 = 1 or f3 = 1\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t1
   partitions: NULL
         type: index_merge
possible_keys: idx_f2,idx_f3,idx_multi_sub
          key: idx_f2,idx_f3
      key_len: 5,5
          ref: NULL
         rows: 684
     filtered: 100.00
        Extra: Using union(idx_f2,idx_f3); Using where
1 row in set, 1 warning (0.00 sec)

有人说，把 OR 改成 UNION ALL 可以避免这种操作，不过也要分场合，比如上面这条 SQL，index_merge 只是简单对索引 idx_f2、idx_f3 做了一个 union ，没有排序，这时候 OR 和 UNION ALL 性能差不了多少。至于更多的 OR 和 UNION ALL 的优化会另外开篇，不在本篇范围之内，就不多做介绍了。

SQL 5、SQL 6 这两条 SQL 完全可以用到组合索引 idx_multi，因为列 f1 包含在过滤条件里，不过由于其他的条件不是 and 而是 or，所以也需要给 or 后的列加单值索引，这样 MySQL 会把索引 idx_multi 和其他单值索引做一个 index_merge 来优化。来分别看看这两条 SQL 的执行计划：

(127.0.0.1:3400)|(ytt)>explain select * from t1 where f1 = 1 or f2 = 1\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t1
   partitions: NULL
         type: index_merge
possible_keys: idx_multi,idx_f2,idx_multi_sub
          key: idx_multi,idx_f2
      key_len: 5,5
          ref: NULL
         rows: 663
     filtered: 100.00
        Extra: Using sort_union(idx_multi,idx_f2); Using where
1 row in set, 1 warning (0.01 sec)

(127.0.0.1:3400)|(ytt)>explain select * from t1 where f1 = 1 or f3 = 1\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t1
   partitions: NULL
         type: index_merge
possible_keys: idx_multi,idx_f3
          key: idx_multi,idx_f3
      key_len: 5,5
          ref: NULL
         rows: 645
     filtered: 100.00
        Extra: Using sort_union(idx_multi,idx_f3); Using where
1 row in set, 1 warning (0.00 sec)

可以看到，MySQL同时用到了索引 idx_multi、idx_f2 和 idx_multi、idx_f3。但是两个执行计划的 index_merge 都用 sort_union ，能不能把 sort_union 变为普通的非排序的 union 呢？这时候就要给列 f1 加单值索引，

(127.0.0.1:3400)|(ytt)>alter table t1 add key idx_f1(f1);
Query OK, 0 rows affected (0.27 sec)
Records: 0  Duplicates: 0  Warnings: 0

来随机看下 SQL 5 的执行计划：

(127.0.0.1:3400)|(ytt)>explain select * from t1 where f1 = 1 or f2 = 1\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t1
   partitions: NULL
         type: index_merge
possible_keys: idx_multi,idx_f2,idx_multi_sub,idx_f1
          key: idx_f1,idx_f2
      key_len: 5,5
          ref: NULL
         rows: 663
     filtered: 100.00
        Extra: Using union(idx_f1,idx_f2); Using where
1 row in set, 1 warning (0.00 sec)

给列 f1 加了单列索引后，index_merge 就避免了sort_union。

SQL 7 和上面两条 SQL 一样，得单独用到列 f1 的单值索引，不同的是 index_merge 是对三个列进行而不是两个列。执行计划如下：

(127.0.0.1:3400)|(ytt)>explain
    -> select * from t1 where f1 = 1 or f2 = 1 or f3 = 1\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t1
   partitions: NULL
         type: index_merge
possible_keys: idx_multi,idx_f2,idx_f3,idx_multi_sub,idx_f1
          key: idx_f1,idx_f2,idx_f3
      key_len: 5,5,5
          ref: NULL
         rows: 996
     filtered: 100.00
        Extra: Using union(idx_f1,idx_f2,idx_f3); Using where
1 row in set, 1 warning (0.00 sec)

可以看到可以很清晰的看到，MySQL 用了三个单值索引的 index_merge。

其实到现在表 t1 已经有很多索引了，

(127.0.0.1:3400)|(ytt)>show create table t1\G
*************************** 1. row ***************************
       Table: t1
Create Table: CREATE TABLE `t1` (
  `id` int NOT NULL,
  `f1` int DEFAULT NULL,
  `f2` int DEFAULT NULL,
  `f3` int DEFAULT NULL,
  `f4` int DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `idx_multi` (`f1`,`f2`,`f3`),
  KEY `idx_f2` (`f2`),
  KEY `idx_f3` (`f3`),
  KEY `idx_multi_sub` (`f2`,`f3`),
  KEY `idx_f1` (`f1`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
1 row in set (0.00 sec)

从以上表结构可以看到，对于我们这 7 条 SQL，除了主键，还多建了 5 个二级索引。实际场景中也不是非得把这些索引全给建上，主要还得看这条语句的运行频率，如果以上哪条 SQL 运行频率较低并且允许时间与业务高峰期不冲突，可以移除这条 SQL 对应的索引来减少索引对表写入的延迟。

文章推荐：

第22期：索引设计（组合索引适用场景）

第21期：索引设计（函数索引）

第20期：索引设计（前缀索引）

关于 MySQL 的技术内容，你们还有什么想知道的吗？赶紧留言告诉小编吧！

第23期：索引设计（组合索引不适用场景改造）

于3月 10, 20213月 10, 2021由杨涛涛发布

回顾下组合索引的语法和必备条件

MySQL 调优

第47期：EXPLAIN TYPE 列的 JOIN 常见场景详解（上）

MySQL 调优

第46期：EXPLAIN：解说一条简单 SQL 语句的执行计划

MySQL 调优

第45期：一条 SQL 语句优化的基本思路

第23期：索引设计（组合索引不适用场景改造）

于3月 10, 20213月 10, 2021由杨涛涛发布

回顾下组合索引的语法和必备条件

相关文章

MySQL 调优

第47期：EXPLAIN TYPE 列的 JOIN 常见场景详解（上）

MySQL 调优

第46期：EXPLAIN：解说一条简单 SQL 语句的执行计划

MySQL 调优

第45期：一条 SQL 语句优化的基本思路