[20140210]一条sql语句的优化(11g).txt
今天下午看生产系统数据库,无意中发现一个错误,同时优化也有点小问题,写一个测试脚本。
1.建立测试环境:
SCOTT@test> @ver
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
create table t pctfree 99 as
select rownum id ,mod(rownum,200) idx,trunc(sysdate)-dbms_random.value(0,200) cr_date ,rpad(rownum,1000,'x') vc from dual connect by level
create unique index pk_t on t (id);
create index i_t_idx on t(idx);
create index i_t_cr_date on t(cr_date);
exec dbms_stats.gather_table_stats(null,'T',estimate_percent=>100,no_invalidate=>false,cascade=>true) ;
2.有问题语句:
SCOTT@test> column vc noprint
SCOTT@test> select * from t where cr_date = ( select max(cr_date) from t where idx=42);
ID IDX CR_DATE
---------- ---------- -------------------
7842 42 2014-02-06 09:57:35
--很明显这样查询结果目前是正确的,但是实际上如果我修改如下:
update t set cr_date='2014-02-06 09:57:35' where id=1e4;
commit ;
--注意我这里取了一个巧,直接使用cr_date='2014-02-06 09:57:35',只要nls*相关参数正确,一般没有问题。
--再次执行:
SCOTT@test> select * from t where cr_date = ( select max(cr_date) from t where idx=42);
ID IDX CR_DATE
---------- ---------- -------------------
7842 42 2014-02-06 09:57:35
10000 0 2014-02-06 09:57:35
--很明显,开发需要是IDX=42的记录,而不需要第2条。只不过日期相重的概率很少罢了。BTW,我已经不止一次发现这样类似的错误。
--至少这样写才正确:
SCOTT@test> alter session set statistics_level=all;
SCOTT@test> select * from t where cr_date = ( select max(cr_date) from t where idx=42) and idx=42;
ID IDX CR_DATE
---------- ---------- -------------------
7842 42 2014-02-06 09:57:35
SCOTT@test> @dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID 77hz50d71h2s9, child number 0
-------------------------------------
select * from t where cr_date = ( select max(cr_date) from t where
idx=42) and idx=42
Plan hash value: 2661465193
-------------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
-------------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 32 (100)| 1 |00:00:00.06 | 39 | | | |
| 1 | TABLE ACCESS BY INDEX ROWID | T | 1 | 1 | 2 (0)| 1 |00:00:00.06 | 39 | | | |
| 2 | BITMAP CONVERSION TO ROWIDS | | 1 | | | 1 |00:00:00.06 | 38 | | | |
| 3 | BITMAP AND | | 1 | | | 1 |00:00:00.06 | 38 | | | |
| 4 | BITMAP CONVERSION FROM ROWIDS| | 1 | | | 1 |00:00:00.06 | 36 | | | |
|* 5 | INDEX RANGE SCAN | I_T_CR_DATE | 1 | 1 | 1 (0)| 2 |00:00:00.04 | 36 | | | |
| 6 | SORT AGGREGATE | | 1 | 1 | | 1 |00:00:00.04 | 34 | | | |
|* 7 | VIEW | index$_join$_002 | 1 | 50 | 30 (4)| 50 |00:00:00.04 | 34 | | | |
|* 8 | HASH JOIN | | 1 | | | 50 |00:00:00.04 | 34 | 1096K| 1096K| 1580K (0)|
|* 9 | INDEX RANGE SCAN | I_T_IDX | 1 | 50 | 1 (0)| 50 |00:00:00.01 | 2 | | | |
| 10 | INDEX FAST FULL SCAN | I_T_CR_DATE | 1 | 50 | 35 (0)| 10000 |00:00:00.01 | 32 | | | |
| 11 | BITMAP CONVERSION FROM ROWIDS| | 1 | | | 1 |00:00:00.01 | 2 | | | |
|* 12 | INDEX RANGE SCAN | I_T_IDX | 1 | 1 | 1 (0)| 50 |00:00:00.01 | 2 | | | |
-------------------------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - access("CR_DATE"=)
7 - filter("IDX"=42)
8 - access(ROWID=ROWID)
9 - access("IDX"=42)
12 - access("IDX"=42)
--这样写,执行计划看上去很复杂。逻辑读39,实际上我的索引很小,生产系统不会选择INDEX FAST FULL SCAN I_T_CR_DATE 。
--生产系统实际上是这样
SCOTT@test> select /*+ index(t i_t_idx) */ * from t where cr_date = ( select /*+ index(t i_t_idx) */ max(cr_date) from t where idx=42) and idx=42;
ID IDX CR_DATE
---------- ---------- -------------------
7842 42 2014-02-06 09:57:35
SCOTT@test> @dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID 60p18ha8myc2j, child number 0
-------------------------------------
select /*+ index(t i_t_idx) */ * from t where cr_date = ( select /*+
index(t i_t_idx) */ max(cr_date) from t where idx=42) and idx=42
Plan hash value: 1551695814
-------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers | Reads |
-------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 102 (100)| 1 |00:00:00.01 | 105 | 59 |
|* 1 | TABLE ACCESS BY INDEX ROWID | T | 1 | 1 | 51 (0)| 1 |00:00:00.01 | 105 | 59 |
|* 2 | INDEX RANGE SCAN | I_T_IDX | 1 | 50 | 1 (0)| 50 |00:00:00.01 | 3 | 0 |
| 3 | SORT AGGREGATE | | 1 | 1 | | 1 |00:00:00.01 | 52 | 58 |
| 4 | TABLE ACCESS BY INDEX ROWID| T | 1 | 50 | 51 (0)| 50 |00:00:00.01 | 52 | 58 |
|* 5 | INDEX RANGE SCAN | I_T_IDX | 1 | 50 | 1 (0)| 50 |00:00:00.01 | 2 | 0 |
-------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("CR_DATE"=)
2 - access("IDX"=42)
5 - access("IDX"=42)
--很明显这样逻辑读有点高,到达105.主要的问题要使用索引范围扫描2次(I_T_IDX).
==更正如下:实际这样执行:
SCOTT@test> select /*+ index(t i_t_cr_date) */ * from t where cr_date = ( select /*+ index(t i_t_idx) */ max(cr_date) from t where idx=42) and idx=42;
SCOTT@test> @dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID a8kx1xu0c7cmh, child number 1
-------------------------------------
select /*+ index(t i_t_cr_date) */ * from t where cr_date = ( select
/*+ index(t i_t_idx) */ max(cr_date) from t where idx=42) and idx=42
Plan hash value: 2838288168
------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers | Reads |
------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 54 (100)| 1 |00:00:00.01 | 57 | 181 |
|* 1 | TABLE ACCESS BY INDEX ROWID | T | 1 | 1 | 3 (0)| 1 |00:00:00.01 | 57 | 181 |
|* 2 | INDEX RANGE SCAN | I_T_CR_DATE | 1 | 1 | 1 (0)| 2 |00:00:00.01 | 55 | 180 |
| 3 | SORT AGGREGATE | | 1 | 1 | | 1 |00:00:00.01 | 52 | 164 |
| 4 | TABLE ACCESS BY INDEX ROWID| T | 1 | 50 | 51 (0)| 50 |00:00:00.01 | 52 | 164 |
|* 5 | INDEX RANGE SCAN | I_T_IDX | 1 | 50 | 1 (0)| 50 |00:00:00.01 | 2 | 16 |
------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("IDX"=42)
2 - access("CR_DATE"=)
5 - access("IDX"=42)
===========================
select * from t where (idx,cr_date ) in ( select idx, max(cr_date) from t where idx=42 group by idx ) ;
--也不是很好。
select * from (select * from t where idx=42 order by cr_date desc) where rownum
--建立索引
create index i_t_idx_cr_date on t(idx,cr_date);
SCOTT@test> select * from t where cr_date = ( select max(cr_date) from t where idx=42) and idx=42;
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID 77hz50d71h2s9, child number 0
-------------------------------------
select * from t where cr_date = ( select max(cr_date) from t where
idx=42) and idx=42
Plan hash value: 3430031104
----------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers | Reads |
----------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 4 (100)| 1 |00:00:00.01 | 6 | 1 |
| 1 | TABLE ACCESS BY INDEX ROWID | T | 1 | 1 | 2 (0)| 1 |00:00:00.01 | 6 | 1 |
|* 2 | INDEX RANGE SCAN | I_T_IDX_CR_DATE | 1 | 1 | 1 (0)| 1 |00:00:00.01 | 5 | 1 |
| 3 | SORT AGGREGATE | | 1 | 1 | | 1 |00:00:00.01 | 2 | 1 |
| 4 | FIRST ROW | | 1 | 1 | 2 (0)| 1 |00:00:00.01 | 2 | 1 |
|* 5 | INDEX RANGE SCAN (MIN/MAX)| I_T_IDX_CR_DATE | 1 | 1 | 2 (0)| 1 |00:00:00.01 | 2 | 1 |
----------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("IDX"=42 AND "CR_DATE"=)
5 - access("IDX"=42)
SCOTT@test> select * from (select * from t where idx=42 order by cr_date desc) where rownumSCOTT@test> @dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID 19pf4scday9pv, child number 0
-------------------------------------
select * from (select * from t where idx=42 order by cr_date desc)
where rownumPlan hash value: 2332835607
-------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
-------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 4 (100)| 1 |00:00:00.01 | 3 |
|* 1 | COUNT STOPKEY | | 1 | | | 1 |00:00:00.01 | 3 |
| 2 | VIEW | | 1 | 2 | 4 (0)| 1 |00:00:00.01 | 3 |
| 3 | TABLE ACCESS BY INDEX ROWID | T | 1 | 50 | 4 (0)| 1 |00:00:00.01 | 3 |
|* 4 | INDEX RANGE SCAN DESCENDING| I_T_IDX_CR_DATE | 1 | 2 | 2 (0)| 1 |00:00:00.01 | 2 |
-------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM 4 - access("IDX"=42)
--这样写最好,当然前提是不会返回多行,好像实际不会出现这种情况。
--这样逻辑读确实下降不少,但是不是我需要,因为建立的索引有些冗余了。
drop index i_t_idx_cr_date ;
SCOTT@test> select * from (select * from t where idx=42 order by cr_date desc) where rownumSCOTT@test> @dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID 19pf4scday9pv, child number 0
-------------------------------------
select * from (select * from t where idx=42 order by cr_date desc)
where rownumPlan hash value: 587900075
--------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 52 (100)| 1 |00:00:00.02 | 52 | | | |
|* 1 | COUNT STOPKEY | | 1 | | | 1 |00:00:00.02 | 52 | | | |
| 2 | VIEW | | 1 | 50 | 52 (2)| 1 |00:00:00.02 | 52 | | | |
|* 3 | SORT ORDER BY STOPKEY | | 1 | 50 | 52 (2)| 1 |00:00:00.02 | 52 | 73728 | 73728 | |
| 4 | TABLE ACCESS BY INDEX ROWID| T | 1 | 50 | 51 (0)| 50 |00:00:00.02 | 52 | | | |
|* 5 | INDEX RANGE SCAN | I_T_IDX | 1 | 50 | 1 (0)| 50 |00:00:00.01 | 2 | | | |
--------------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM 3 - filter(ROWNUM 5 - access("IDX"=42)
--这样仅仅扫描i_t_idx一次。比原来逻辑读少一半。
总结:
1.修改语句如下:
select * from (select * from t where idx=42 order by cr_date desc) where rownum2.比较好的解决方式是建立idx,cr_date的复合索引。删除idx的索引。