Preparation:
MySQL (preparation): using a stored procedure to quickly insert a million rows
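1. Covering indexes
Index types in InnoDB
- The primary key (clustered) index stores the index value together with the actual row data.
- A secondary index stores the index value together with the primary key value.
A covering index avoids the extra lookup back into the clustered index (the "back to table" step), as the following cases show.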
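- select pk_col from tbl where pk_col='xxx'; [covering index]
- select * from tbl where pk_col='xxx'; [not a covering index]
- select pk_col,secondary_col from tbl where secondary_col='xxx'; [covering index]
- select * from tbl where secondary_col='xxx'; [not a covering index]
- select pk_col,secondary_col from tbl where secondary_col='xxx' and plain_col='yyy'; [not a covering index, because plain_col is not in the secondary index]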
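Advantages of a covering index:
- Fewer I/O operations: index entries are much smaller than full rows, so a query that only needs to read the index touches far less data.
- Covering indexes are especially useful for InnoDB tables because of InnoDB's clustered index. An InnoDB secondary index stores the row's primary key value in its leaf nodes, so if the secondary index covers the query, the second lookup in the primary key index is avoided.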
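The cases listed above can be checked with EXPLAIN. The following is a minimal sketch, assuming a hypothetical table `demo_user` with a secondary index `idx_name` (neither appears in the original). In MySQL, `Using index` in the Extra column indicates that the query is answered from the index alone. Exact plans depend on the MySQL version and table statistics.

```sql
-- Hypothetical table for illustration only (not from the original article).
CREATE TABLE demo_user (
  id   BIGINT NOT NULL AUTO_INCREMENT,
  name VARCHAR(32) NOT NULL DEFAULT '',
  age  INT DEFAULT NULL,
  PRIMARY KEY (id),
  KEY idx_name (name)
) ENGINE=InnoDB;

INSERT INTO demo_user (name, age) VALUES ('ann', 20), ('bob', 30), ('cat', 40);

-- Covered: idx_name stores name plus the primary key id,
-- so Extra is expected to include "Using index".
EXPLAIN SELECT id, name FROM demo_user WHERE name = 'bob';

-- Not covered: age is stored only in the clustered index,
-- so each matching entry needs a lookup by primary key.
EXPLAIN SELECT * FROM demo_user WHERE name = 'bob';

-- Not covered: the filter on age cannot be evaluated from idx_name alone.
EXPLAIN SELECT id, name FROM demo_user WHERE name = 'bob' AND age = 30;
```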
2. Covering indexes and deferred joins
Table:
mysql> show create table `t_user`;
+--------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+--------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| t_user | CREATE TABLE `t_user` (
`id` bigint(20) NOT NULL AUTO_INCREMENT COMMENT '主键',
`name` varchar(32) DEFAULT '' COMMENT '名字',
`age` int(11) DEFAULT NULL COMMENT '年龄',
`p_id` bigint(20) NOT NULL DEFAULT '1',
PRIMARY KEY (`id`) USING BTREE,
KEY `idx_cox` (`p_id`,`name`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=59 DEFAULT CHARSET=latin1 |
+--------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
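Optimize the SQL with a deferred join: run the nested subquery first to find the matching ids from the index, then join back to the table for the full rows (this suits cases where the subquery's filter conditions are complex).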
mysql> EXPLAIN select * from t_user where p_id =1 and name like '%ji%';
+----+-------------+--------+------+-----------------+---------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------+------+-----------------+---------+---------+-------+------+-------------+
| 1 | SIMPLE | t_user | ref | idx_pId,idx_cox | idx_pId | 8 | const | 40 | Using where |
+----+-------------+--------+------+-----------------+---------+---------+-------+------+-------------+
1 row in set (0.00 sec)
mysql> EXPLAIN select * from t_user u INNER JOIN (select id from t_user where p_id =1 and name like '%ji%' ) t on u.id=t.id;
+----+-------------+------------+--------+-----------------+---------+---------+-------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+--------+-----------------+---------+---------+-------+------+--------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 40 | NULL |
| 1 | PRIMARY | u | eq_ref | PRIMARY | PRIMARY | 8 | t.id | 1 | NULL |
| 2 | DERIVED | t_user | ref | idx_pId,idx_cox | idx_cox | 8 | const | 40 | Using where; Using index |
+----+-------------+------------+--------+-----------------+---------+---------+-------+------+--------------------------+
3 rows in set (0.00 sec)
3. Paginated queries and deferred joins
-- enable profiling so that SHOW PROFILES records query durations
mysql> set profiling = 1;
Query OK, 0 rows affected
-- t1 holds about a million rows
mysql> select * from t1 order by t1_col limit 10000,10;
+--------+--------+---------+
| id | t1_col | t1_col2 |
+--------+--------+---------+
| 452157 | t1_107 | t11_823 |
| 452366 | t1_107 | t11_601 |
| 452644 | t1_107 | t11_348 |
| 450130 | t1_107 | t11_661 |
| 448710 | t1_107 | t11_149 |
| 451508 | t1_107 | t11_397 |
| 451619 | t1_107 | t11_330 |
| 477700 | t1_107 | t11_256 |
| 476010 | t1_107 | t11_709 |
| 472442 | t1_107 | t11_411 |
+--------+--------+---------+
10 rows in set
-- optimize the paginated query with a deferred join and a covering index
mysql> select * from t1,(select id from t1 order by t1_col limit 10000,10) c where t1.id=c.id;
+------+--------+---------+------+
| id | t1_col | t1_col2 | id |
+------+--------+---------+------+
| 2630 | t1_107 | t11_96 | 2630 |
| 3087 | t1_107 | t11_243 | 3087 |
| 4380 | t1_107 | t11_979 | 4380 |
| 4474 | t1_107 | t11_797 | 4474 |
| 5932 | t1_107 | t11_189 | 5932 |
| 7138 | t1_107 | t11_81 | 7138 |
| 7408 | t1_107 | t11_276 | 7408 |
| 7920 | t1_107 | t11_327 | 7920 |
| 8388 | t1_107 | t11_4 | 8388 |
| 8392 | t1_107 | t11_353 | 8392 |
+------+--------+---------+------+
10 rows in set
-- note: the two result sets differ because many rows share t1_col = 't1_107', and ORDER BY t1_col alone does not define a unique order among them
-- compare the durations
mysql> show profiles;
+----------+------------+----------------------------------------------------------------------------------------+
| Query_ID | Duration | Query |
+----------+------------+----------------------------------------------------------------------------------------+
| 1 | 2.620051 | select * from t1 order by t1_col limit 10000,10 |
| 2 | 0.01078775 | select * from t1,(select id from t1 order by t1_col limit 10000,10) c where t1.id=c.id |
+----------+------------+----------------------------------------------------------------------------------------+
2 rows in set
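The deferred join only pays off if the subquery can be answered from an index on `t1_col`. The original does not show the definition of `t1`, so the index name `idx_t1_col` below is an assumption. This sketch also selects only `t1.*` to avoid the duplicate `id` column, adds `id` as a tiebreaker, and repeats the ORDER BY in the outer query, since a join does not guarantee that the subquery's order is preserved.

```sql
-- Assumed index; skip this statement if t1 already has an index on t1_col.
ALTER TABLE t1 ADD INDEX idx_t1_col (t1_col);

SELECT t1.*
FROM t1
INNER JOIN (
    -- Stage 1: read only the secondary index; InnoDB stores the
    -- primary key id in every entry of idx_t1_col.
    SELECT id
    FROM t1
    ORDER BY t1_col, id
    LIMIT 10000, 10
) AS c ON t1.id = c.id
-- Stage 2: only the 10 selected rows are fetched from the clustered index.
ORDER BY t1.t1_col, t1.id;
```

Running EXPLAIN on this statement should show `Using index` for the derived table if the subquery is covered.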
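High Performance MySQL describes it as follows:
Other optimizations for deep pagination:
- At the product level, constrain how far the pagination cursor can go, as Google does.
- Change the API so that each response returns a scroll value (the current scroll id). Jumping to an arbitrary page number is then not supported: select * from tbl where id > scroll_id order by id limit 0,200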
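The scroll-id approach can be sketched against the `t1` table used earlier. The value `12345` is a placeholder for the largest `id` returned by the previous page, not a value from the original. The ORDER BY is needed so that "largest id seen so far" is well defined.

```sql
-- First page: no cursor yet.
SELECT id, t1_col, t1_col2
FROM t1
ORDER BY id
LIMIT 200;

-- Next page: 12345 stands in for the last id of the previous page.
-- The primary key seeks straight to that position, so the cost
-- does not grow with how deep the client has scrolled.
SELECT id, t1_col, t1_col2
FROM t1
WHERE id > 12345
ORDER BY id
LIMIT 200;
```

The client passes back the last `id` it received, which is why this design cannot jump to an arbitrary page number.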
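Summary:
Covering index: every column the query needs can be read from the index itself, so no lookup back to the table is required.
Table lookup ("back to table"): using the primary key found in a secondary index to fetch the row from the leaf nodes of the clustered index.
Deferred join: defer access to the full columns. The first stage of the query uses a covering index to find the matching primary keys, which are then used to fetch the needed rows.
Recommended reading
The difference between Using index condition and Using where; Using index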