MySQL 关于 reconsidering_access_paths_for_index_ordering

参考

版本 5.7.24

问题复现

CREATE TABLE `test_tbl` (
      `id` bigint(20) NOT NULL AUTO_INCREMENT,
      `a` varchar(64) NOT NULL COMMENT '',
      `b` tinyint(4) NOT NULL DEFAULT '1' COMMENT '',
      `c` varchar(64) NOT NULL COMMENT '',
      `gmt_create` datetime DEFAULT CURRENT_TIMESTAMP,
      `gmt_update` datetime DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
      PRIMARY KEY (`id`),
      UNIQUE KEY `uk_a_c` (`a`,`c`),
      KEY `idx_a_b` (`a`,`b`)
) ENGINE=InnoDB AUTO_INCREMENT=7011293 DEFAULT CHARSET=utf8
# 数据样本700w，b字段基本没有区分度
# Query 1
> SELECT id, a, b, c, gmt_create, gmt_update FROM test_tbl WHERE a = '163944888' AND b = 1 ORDER BY id LIMIT 0, 20
20 rows in set (2.79 sec)
# Query 2
> SELECT id, a, b, c, gmt_create, gmt_update FROM test_tbl WHERE a = '163944888' AND b = 1 ORDER BY id LIMIT 0, 50
50 rows in set (0.02 sec)
# Query 3
> SELECT id, a, b, c, gmt_create, gmt_update FROM test_tbl force index(`idx_a_b`) WHERE a = '163944888' AND b = 1 ORDER BY id LIMIT 0, 50;
50 rows in set (0.00 sec)

同样的查询当limit大小为20的时候，执行时间较长。当limit大小为50的时候，执行速度更快。当强制指定索引 idx_a_b时，结果更加快。

从Explain执行计划上看到
Query 1: 使用Primary全表扫描，不需要排序。
Query 2: 使用uk_a_c唯一索引，a字段生效，结果需要排序。
Query 3: 使用idx_a_b二级索引，在不考虑回表取数据过程，该索引理论上是最优方案。

Optimizer Trace

对比 `Query 1` 和 `Query 2` 的Optimizer Trace结果

image.png

SQL 优化器对比，reconsidering_access_paths_for_index_ordering 阶段，Query 1 选择了主键扫描避免排序，Query 2选择uk_a_c索引。

从官方了解到后续MySQL可能会提供改阶段行为的开关。
bugs.mysql.com: Dangerous optimization reconsidering_access_paths_for_index_ordering

那选择idx_a_c不就好了吗？为什么Query 2会选择uk_a_c索引？

在前面的considered_execution_plans阶段，uk_a_c cost略小于idx_a_b,一般是由于b字段区分度不高产生。

image.png

我们对样本表数据重新处理，将b字段处理为主键值。

> EXPLAIN SELECT id, a, b, c, gmt_create, gmt_update FROM test_tbl_01 WHERE a = '163944888' AND b = 1 ORDER BY id LIMIT 0, 20;
+----+-------------+-------------+------------+------+----------------+---------+---------+-------------+------+----------+-----------------------+
| id | select_type | table       | partitions | type | possible_keys  | key     | key_len | ref         | rows | filtered | Extra                 |
+----+-------------+-------------+------------+------+----------------+---------+---------+-------------+------+----------+-----------------------+
|  1 | SIMPLE      | test_tbl_01 | NULL       | ref  | uk_a_c,idx_a_b | idx_a_b | 195     | const,const |    1 |   100.00 | Using index condition |
+----+-------------+-------------+------------+------+----------------+---------+---------+-------------+------+----------+-----------------------+
1 row in set, 1 warning (0.00 sec)

总结

idx_a_b索引b字段区分度不高，导致优化器选择uk_a_c索引, 从而触发reconsidering_access_paths_for_index_ordering选择PRIMARY。
reconsidering_access_paths_for_index_ordering 会触发主键全表扫描，和以往 order by id limit N（当N特别大时，避免回表查询以及排序，会选择全表扫描）不同，这次是在N较小时，主动选择了全表扫描。
还有一种绕过方案：uk_a_c 顺序调整，即改成 uk_c_a，不让该唯一索引参与SQL优化过程，而是使用到idx_a_b，从而也就避免了reconsidering_access_paths_for_index_ordering 阶段选择PRIMARY。

最后编辑于：2020.01.08 16:06:29

MySQL 关于 reconsidering_access_paths_for_index_ordering

参考

问题复现

Optimizer Trace

对比 Query 1 和 Query 2 的Optimizer Trace结果

那选择idx_a_c不就好了吗？为什么Query 2会选择uk_a_c索引？

总结

对比 `Query 1` 和 `Query 2` 的Optimizer Trace结果