Environment: MySQL 5.7/8.0
Import the test data:
git clone https://github.com/datacharmer/test_db
cd test_db
mysql -u root -p < employees.sql
employees: 300,024 rows
salaries: 2,844,047 rows
1. Run an aggregate query that correlates the two tables. It returns almost instantly and reads only 122 rows in total.
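As a quick sanity check after the import, the row counts can be compared against the figures above. This is a minimal sketch, assuming the dump was loaded into its default employees schema:

```sql
-- Assumption: the test_db dump created a schema named `employees`.
use employees;
select count(*) from employees;
select count(*) from salaries;
```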
mysql> select VARIABLE_VALUE into @a from performance_schema.session_status where variable_name = 'Innodb_rows_read';
Query OK, 1 row affected (0.00 sec)

mysql> select e.emp_no,(select max(s.salary) from salaries s where s.emp_no=e.emp_no) from employees e limit 10;
+--------+----------------------------------------------------------------+
| emp_no | (select max(s.salary) from salaries s where s.emp_no=e.emp_no) |
+--------+----------------------------------------------------------------+
|  10001 |                                                          88958 |
|  10002 |                                                          72527 |
|  10003 |                                                          43699 |
|  10004 |                                                          74057 |
|  10005 |                                                          94692 |
|  10006 |                                                          60098 |
|  10007 |                                                          88070 |
|  10008 |                                                          52668 |
|  10009 |                                                          94443 |
|  10010 |                                                          80324 |
+--------+----------------------------------------------------------------+
10 rows in set (0.00 sec)

mysql> select VARIABLE_VALUE into @b from performance_schema.session_status where variable_name = 'Innodb_rows_read';
Query OK, 1 row affected (0.00 sec)

mysql> select @b-@a;
+-------+
| @b-@a |
+-------+
|   122 |
+-------+
1 row in set (0.00 sec)
2. Wrap the same query in a view and query it again. It is now very slow, and it actually reads about 3.14 million rows.
mysql> create view v_test as select e.emp_no,(select max(s.salary) from salaries s where s.emp_no=e.emp_no) from employees e;
Query OK, 0 rows affected (0.01 sec)

mysql> select VARIABLE_VALUE into @a from performance_schema.session_status where variable_name = 'Innodb_rows_read';
Query OK, 1 row affected (0.00 sec)

mysql> select * from v_test limit 10;
+--------+----------------------------------------------------------------+
| emp_no | (select max(s.salary) from salaries s where s.emp_no=e.emp_no) |
+--------+----------------------------------------------------------------+
|  10001 |                                                          88958 |
|  10002 |                                                          72527 |
|  10003 |                                                          43699 |
|  10004 |                                                          74057 |
|  10005 |                                                          94692 |
|  10006 |                                                          60098 |
|  10007 |                                                          88070 |
|  10008 |                                                          52668 |
|  10009 |                                                          94443 |
|  10010 |                                                          80324 |
+--------+----------------------------------------------------------------+
10 rows in set (1.34 sec)

mysql> select VARIABLE_VALUE into @b from performance_schema.session_status where variable_name = 'Innodb_rows_read';
Query OK, 1 row affected (0.00 sec)

mysql> select @b-@a;
+---------+
| @b-@a   |
+---------+
| 3144071 |
+---------+
1 row in set (0.00 sec)
3. Look at the execution plan of each query
mysql> explain select e.emp_no,(select max(s.salary) from salaries s where s.emp_no=e.emp_no) from employees e limit 10;
+----+--------------------+-------+------------+-------+---------------+---------+---------+--------------------+--------+----------+-------------+
| id | select_type        | table | partitions | type  | possible_keys | key     | key_len | ref                | rows   | filtered | Extra       |
+----+--------------------+-------+------------+-------+---------------+---------+---------+--------------------+--------+----------+-------------+
|  1 | PRIMARY            | e     | NULL       | index | NULL          | PRIMARY | 4       | NULL               | 299556 |   100.00 | Using index |
|  2 | DEPENDENT SUBQUERY | s     | NULL       | ref   | PRIMARY       | PRIMARY | 4       | employees.e.emp_no |      9 |   100.00 | NULL        |
+----+--------------------+-------+------------+-------+---------------+---------+---------+--------------------+--------+----------+-------------+
2 rows in set, 2 warnings (0.00 sec)

mysql> explain select * from v_test limit 10;
+----+--------------------+------------+------------+-------+---------------+---------+---------+--------------------+--------+----------+-------------+
| id | select_type        | table      | partitions | type  | possible_keys | key     | key_len | ref                | rows   | filtered | Extra       |
+----+--------------------+------------+------------+-------+---------------+---------+---------+--------------------+--------+----------+-------------+
|  1 | PRIMARY            | <derived2> | NULL       | ALL   | NULL          | NULL    | NULL    | NULL               | 299556 |   100.00 | NULL        |
|  2 | DERIVED            | e          | NULL       | index | NULL          | PRIMARY | 4       | NULL               | 299556 |   100.00 | Using index |
|  3 | DEPENDENT SUBQUERY | s          | NULL       | ref   | PRIMARY       | PRIMARY | 4       | employees.e.emp_no |      9 |   100.00 | NULL        |
+----+--------------------+------------+------------+-------+---------------+---------+---------+--------------------+--------+----------+-------------+
3 rows in set, 2 warnings (0.00 sec)
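4. Analyze the execution plans:
The only difference between the two plans is that the query through the view has one extra step: a derived table (<derived2>).
The official documentation on derived tables: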
https://dev.mysql.com/doc/refman/5.7/en/derived-tables.html
The official documentation on how derived tables are optimized:
https://dev.mysql.com/doc/refman/5.7/en/derived-table-optimization.html
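Because of limit 10, the first query never scans the whole table, even though the plan shows a large row estimate for e. It computes the result for the first 10 rows and then stops.
The second query also has limit 10, but because the optimizer produces a derived table, the result of the aggregate query for every employee is first written to a temporary table, and the 10 rows are then read from that temporary table. This matches the measured 3,144,071 rows read, which is exactly 300,024 (employees) plus 2,844,047 (salaries).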
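One way to observe the materialization from the session itself is to compare status counters for the two queries. The following is only a sketch: it assumes the account has the privilege needed for flush status, and that Created_tmp_tables and Handler_write reflect the internal temporary table on the version in use. The status statements may touch these counters themselves, so only the difference between the two runs is meaningful.

```sql
-- Baseline: the direct query.
flush status;
select e.emp_no,(select max(s.salary) from salaries s where s.emp_no=e.emp_no) from employees e limit 10;
show session status where variable_name in ('Created_tmp_tables','Handler_write');

-- Same measurement through the view.
flush status;
select * from v_test limit 10;
show session status where variable_name in ('Created_tmp_tables','Handler_write');
```

If the view is materialized as described, the second run would be expected to show far more row writes than the first, on the order of the number of rows in employees.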
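MySQL does have a derived table merge optimization, but according to the documentation it cannot be applied when the derived table or view contains aggregate functions (such as MAX() or COUNT()), GROUP BY, HAVING, LIMIT, and similar constructs. The documentation also lists subqueries in the select list among them, which is the case for v_test.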
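To check whether a particular view definition can be merged, one option is to request the merge algorithm explicitly and look at the warnings. This is a sketch, not a definitive test: the view names v_test_merge and v_emp are hypothetical and exist only for this illustration. The MySQL documentation states that when ALGORITHM = MERGE is requested for a view that can only be processed with a temporary table, the server generates a warning and sets the algorithm to UNDEFINED.

```sql
-- Hypothetical view name; same definition as v_test, but asking for MERGE.
create algorithm=merge view v_test_merge as
select e.emp_no,(select max(s.salary) from salaries s where s.emp_no=e.emp_no) as max_salary
from employees e;
show warnings;

-- For contrast: a hypothetical view with no aggregate or subquery in its select list.
create algorithm=merge view v_emp as select e.emp_no from employees e;
explain select * from v_emp limit 10;
```

If v_emp is merged, its plan should reference e directly, with no <derived2> row.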
For now, there seems to be only one way to avoid this problem: do not use a view.