MySQL 优化器之Index merge Multi-Range Read MRR与Batched Key Access使用案例详解

2021-07-20 04:46:03

MySQL 优化器之Index merge Multi-Range Read MRR与Batched Key Access介绍

索引合并Index merge

覆盖索引是在索引里即有查询时需要的字段，而在一个表上的查询往往条件会有多个组合且较为复杂，所以很难全部由覆盖索引包含到。那么此时我们就会用到索引合并（index merge）算法。

该算法主要包含三类情况：

算法	应用场景	执行计划
Intersection	and	Using intersect(...)
Union	or	Using or(...)
Sort-Union	带范围查询的or	sort_union(...)

交Intersection算法

EXPLAIN ANALYZE
SELECT *
FROM sakila.payment
WHERE staff_id = 1
AND customer_id = 75;
--结果
-> Filter: ((payment.customer_id = 75) and (payment.staff_id = 1))  (cost=8.12 rows=20) (actual time=2.329..2.637 rows=22 loops=1)

-> Index range scan on payment using intersect(idx_fk_customer_id,idx_fk_staff_id)  (cost=8.12 rows=20) (actual time=2.325..2.617 rows=22 loops=1)

从执行计划里我们不难看出这里用到了索引交(intersect)

而通过explain命令则可以看出是

mysql> EXPLAIN  SELECT * FROM sakila.payment WHERE staff_id = 1 AND customer_id = 75\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: payment
   partitions: NULL
         type: index_merge
possible_keys: idx_fk_staff_id,idx_fk_customer_id
          key: idx_fk_customer_id,idx_fk_staff_id
      key_len: 2,1
          ref: NULL
         rows: 20
     filtered: 100.00
        Extra: Using intersect(idx_fk_customer_id,idx_fk_staff_id); Using where

通过观察该查询和表的结构可以看出

show create table sakila.payment
--结果
CREATE TABLE `payment` (
  `payment_id` smallint unsigned NOT NULL AUTO_INCREMENT,
  `customer_id` smallint unsigned NOT NULL,
  `staff_id` tinyint unsigned NOT NULL,
  `rental_id` int DEFAULT NULL,
  `amount` decimal(5,2) NOT NULL,
  `payment_date` datetime NOT NULL,
  `last_update` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`payment_id`) USING BTREE,
  KEY `idx_fk_staff_id` (`staff_id`) USING BTREE,
  KEY `idx_fk_customer_id` (`customer_id`) USING BTREE,
  KEY `fk_payment_rental` (`rental_id`) USING BTREE,
  CONSTRAINT `fk_payment_customer` FOREIGN KEY (`customer_id`) REFERENCES `customer` (`customer_id`) ON DELETE RESTRICT ON UPDATE CASCADE,
  CONSTRAINT `fk_payment_rental` FOREIGN KEY (`rental_id`) REFERENCES `rental` (`rental_id`) ON DELETE SET NULL ON UPDATE CASCADE,
  CONSTRAINT `fk_payment_staff` FOREIGN KEY (`staff_id`) REFERENCES `staff` (`staff_id`) ON DELETE RESTRICT ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=16050 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci

字段staff_id 和customer_id 有各自的索引idx_fk_staff_id、idx_fk_customer_id，而我们的查询是等值条件的交集。

同理如下查询也是应用到了index merge的intersection算法。

EXPLAIN
SELECT *
FROM sakila.payment
WHERE payment_id > 10
AND customer_id = 318;

并Union算法

当一个表查询中用到一系列的含OR的等值条件时一般会用到并算法。示例查询见下：

SELECT *
FROM sakila.payment
WHERE staff_id = 1
OR customer_id = 318;

SELECT *
FROM sakila.payment
WHERE payment_id > 15000
OR customer_id = 318;
-- 通过explain或者explain、EXPLAIN ANALYZE()
mysql> EXPLAIN SELECT * FROM sakila.payment WHERE staff_id = 1 OR customer_id = 318\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: payment
   partitions: NULL
         type: index_merge
possible_keys: idx_fk_staff_id,idx_fk_customer_id
          key: idx_fk_staff_id,idx_fk_customer_id
      key_len: 1,2
          ref: NULL
         rows: 8069
     filtered: 100.00
        Extra: Using union(idx_fk_staff_id,idx_fk_customer_id); Using where

mysql> EXPLAIN ANALYZE SELECT * FROM sakila.payment WHERE staff_id = 1 OR customer_id = 318\G
*************************** 1. row ***************************
EXPLAIN: -> Filter: ((sakila.payment.staff_id = 1) or (sakila.payment.customer_id = 318))  (cost=2306.30 rows=8069) (actual time=0.224..103.595 rows=8062 loops=1)
-> Index range scan on payment using union(idx_fk_staff_id,idx_fk_customer_id)  (cost=2306.30 rows=8069) (actual time=0.221..98.988 rows=8062 loops=1)

mysql> EXPLAIN FORMAT=tree SELECT * FROM sakila.payment WHERE staff_id = 1 OR customer_id = 318\G
*************************** 1. row ***************************
EXPLAIN: -> Filter: ((sakila.payment.staff_id = 1) or (sakila.payment.customer_id = 318))  (cost=2116.85 rows=8069)
    -> Index range scan on payment using union(idx_fk_staff_id,idx_fk_customer_id)  (cost=2116.85 rows=8069)

排序并Sort Union算法

跟并算法类似，不过这里用的不是等值条件而是范围，详见如下例子：

-- 查询payment表里customer_id小于30或者rental_id小于10的信息。
mysql> EXPLAIN SELECT * FROM sakila.payment WHERE customer_id < 30 OR rental_id < 10\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: payment
   partitions: NULL
         type: index_merge
possible_keys: idx_fk_customer_id,fk_payment_rental
          key: idx_fk_customer_id,fk_payment_rental
      key_len: 2,5
          ref: NULL
         rows: 826
     filtered: 100.00
        Extra: Using sort_union(idx_fk_customer_id,fk_payment_rental); Using where

查询payment表里customer_id小于20或者rental_id大于18000的信息。
mysql> EXPLAIN SELECT * FROM sakila.payment WHERE customer_id < 20 OR rental_id >18000\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: payment
   partitions: NULL
         type: index_merge
possible_keys: idx_fk_customer_id,fk_payment_rental
          key: idx_fk_customer_id,fk_payment_rental
      key_len: 2,5
          ref: NULL
         rows: 514
     filtered: 100.00
        Extra: Using sort_union(idx_fk_customer_id,fk_payment_rental); Using where


EXPLAIN ANALYZE
SELECT /*+ NO_INDEX_MERGE(payment) */
*
FROM sakila.payment
WHERE staff_id = 1
AND customer_id = 75;

-> Filter: (payment.staff_id = 1)  (cost=12.30 rows=21) (actual time=0.151..0.179 rows=22 loops=1)
-> Index lookup on payment using idx_fk_customer_id (customer_id=75)  (cost=12.30 rows=41) (actual time=0.149..0.167 rows=41 loops=1)
指定不用index merge的hint。

多范围读MRR

多范围读MRR概念

多范围读即Multi-Range Read (MRR)的主要作用是减少扫描辅助索引时产生的IO随机读。优化器一般先读取索引覆盖的字段，其它字段通过rowid关联到聚集索引而找出。多范围读(Multi-Range Read)优化可用于使用索引的范围扫描和等价连接,它不支持虚拟列上的二级索引。

具体的说当读取到辅助索引里字段后再关联到主键，对主键在内存里进行排序后再顺序的读取其它字段的信息。这样就会减少IO随机读。

InnoDB多范围读优化(MRR)的主要应用于没有覆盖索引的磁盘受限查询（当写比读快很多时，读就即被认为是磁盘受限）。

优化参数

用optimizer_switch的参数来控制是否使用MRR.设置mrr=on时，表示启用MRR优化。

mrr_cost_based表示是否通过cost base的方式来启用MRR。

当mrr=on,mrr_cost_based=on,则表示cost base的方式还选择启用MRR优化,当发现优化后的代价过高时就会不使用该项优化。

当mrr=on,mrr_cost_based=off,则表示总是开启MRR优化。

多访问读示例

mysql> EXPLAIN
SELECT /*+ MRR(city) */
*
FROM world.city
WHERE CountryCode BETWEEN 'AUS' AND 'CHN'\G
**************************** 1. row *****************************
id: 1
select_type: SIMPLE
table: city
partitions: NULL
type: range
possible_keys: CountryCode
key: CountryCode
key_len: 3
ref: NULL
rows: 812
filtered: 100
Extra: Using index condition; Using MRR

批量键访问Batched Key Access

批量键访问概念

批量键值访问（Batched Key Access）简称BKA,该优化操作是块内嵌循环和多范围读的整合。这样非索引联接就可以使用类似连接缓冲区索引联接的方式进行优化，通过使用多范围读取优化来减少随机I/O的数量。

由于批处理键值访问优化主要受益的查询范围相对较窄，而且其他查询的性能可能会下降，因此默认情况下禁用优化。

批量键访问示例

SET SESSION
optimizer_switch
= 'mrr=on,mrr_cost_based=off,batched_key_access=on';

mysql>  EXPLAIN
    -> SELECT /*+ BKA(ci) */
    -> co.Code, co.Name AS Country,
    -> ci.Name AS City
    -> FROM world.country co
    -> INNER JOIN world.city ci
    -> ON ci.CountryCode = co.Code\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: co
   partitions: NULL
         type: ALL
possible_keys: PRIMARY
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 239
     filtered: 100.00
        Extra: NULL
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: ci
   partitions: NULL
         type: ref
possible_keys: CountryCode
          key: CountryCode
      key_len: 12
          ref: world.co.Code
         rows: 17
     filtered: 100.00
        Extra: Using join buffer (Batched Key Access)

In this example, the Batched Key Access is enabled using an optimizer hint for the
join on the city (ci) table using the CountryCode index

上例中通过优化器提示(hint)启用了批量键访问选项,即在关联city表时用到CountryCode索引。

批量键值范围优化主要应用在数据量大的关联中，关联缓存则要相应的设置的多些，一般它对应的参数join_buffer_size要设置成4M或者更大。

码农公寓