之前写的《mysql B+Tree索引的一点理解》一文中,介绍了MySQL在使用辅助索引的原理,通过辅助索引进行回表不难理解就相当于Oracle的index skip scan.但是mysql5.6版本中推出了mrr功能,其实就是将随机访问的数据,通过内部机制缓存到线程内存read_rnd_buffer_size中,然后进行排序,排序后的数据再访问主键索引,将随机访问改变为了顺序访问。
一:优点
1.磁盘和磁头不再需要来回做机械运动
如果没有这个功能,那么每获取一个辅助索引的叶子块就会遍历一下主键,找到对应的数据--该过程我们又称为回表。
mrr功能,将这些辅助索引扫描后的数据同一进行缓存,然后一次性访问主键索引,然后找到对应的数据,这样就大大减少了访问数据块的数量
2.可以充分利用磁盘预读
mysql数据库有一个预读功能,也就是访问一个页的数据时,将临近页也会加载到内存中,刚好需要下一页的数据时就不再需要进行物理IO
二:案例演示
说明:本测试在mysql 5.7.35中进行测试。
1.表结构
Create Table: CREATE TABLE `salaries` (
`emp_no` int(11) NOT NULL,
`salary` int(11) NOT NULL,
`from_date` date NOT NULL,
`to_date` date NOT NULL,
PRIMARY KEY (`emp_no`,`from_date`),
KEY `idx_salaries_salary` (`salary`),
CONSTRAINT `salaries_ibfk_1` FOREIGN KEY (`emp_no`) REFERENCES `employees` (`emp_no`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8
2.优化器参数
root@localhost [employees]>show variables like '%optimizer_switch%'\G;
*************************** 1. row ***************************
Variable_name: optimizer_switch
Value: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,engine_condition_pushdown=on,index_condition_pushdown=on,mrr=on,mrr_cost_based=on,block_nested_loop=on,batched_key_access=off,materialization=on,semijoin=on,loosescan=on,firstmatch=on,duplicateweedout=on,subquery_materialization_cost_based=on,use_index_extensions=on,condition_fanout_filter=on,derived_merge=on,prefer_ordering_index=on
1 row in set (0.00 sec)
3.查询SQL并查看执行计划
从这里并没有发现该执行步骤使用了mrr功能,还是每行检索之后访问主键索引,然后进行回表
root@localhost [employees]>explain select * from salaries where salary>10000 and salary<40000;
+----+-------------+----------+------------+-------+---------------------+---------------------+---------+------+-------+----------+-----------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------+------------+-------+---------------------+---------------------+---------+------+-------+----------+-----------------------+
| 1 | SIMPLE | salaries | NULL | range | idx_salaries_salary | idx_salaries_salary | 4 | NULL | 21450 | 100.00 | Using index condition |
+----+-------------+----------+------------+-------+---------------------+---------------------+---------+------+-------+----------+-----------------------+
4.关闭MySQL成本控制
root@localhost [employees]>set optimizer_switch='mrr_cost_based=off';
Query OK, 0 rows affected (0.00 sec)
root@localhost [employees]>show variables like '%optimizer_switch%'\G;
*************************** 1. row ***************************
Variable_name: optimizer_switch
Value: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,engine_condition_pushdown=on,index_condition_pushdown=on,mrr=on,mrr_cost_based=off,block_nested_loop=on,batched_key_access=off,materialization=on,semijoin=on,loosescan=on,firstmatch=on,duplicateweedout=on,subquery_materialization_cost_based=on,use_index_extensions=on,condition_fanout_filter=on,derived_merge=on,prefer_ordering_index=on
1 row in set (0.00 sec)
5.再次执行查看执行计划
这时我们发现,执行计划已经使用了mrr功能,对辅助索引数据进行缓存之后,一次回表,
root@localhost [employees]>explain select * from salaries where salary>10000 and salary<40000;
+----+-------------+----------+------------+-------+---------------------+---------------------+---------+------+-------+----------+----------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------+------------+-------+---------------------+---------------------+---------+------+-------+----------+----------------------------------+
| 1 | SIMPLE | salaries | NULL | range | idx_salaries_salary | idx_salaries_salary | 4 | NULL | 21450 | 100.00 | Using index condition; Using MRR |
+----+-------------+----------+------------+-------+---------------------+---------------------+---------+------+-------+----------+----------------------------------+
但是上面基于成本MySQL为什么没有使用这种方式呢?
显然上面回表效率是高效的,但是MySQL优化器对于MRR功能又是相当的悲观。还是尽可能的选择索引扫描回表。这是我们需要注意的地方