【SQL】使用分析函数与关联子查询的比较

由于业务需要,需实现如下功能的sql语句:查询出一个表T中每个不同id值所对应的最大val值。
---构造环境。
SQL> insert into t values (1,2,1);
已创建 1 行。
SQL> insert into t values (1,2,3);
已创建 1 行。
SQL> insert into t values (1,3,4);
已创建 1 行。
SQL> insert into t values (1,3,5);
已创建 1 行。
SQL> insert into t values (2,3,5);
已创建 1 行。
SQL> insert into t values (2,3,4);
已创建 1 行。
SQL> insert into t values (2,3,6);
已创建 1 行
SQL> insert into t values (2,3,56);
已创建 1 行。
SQL> commit;
提交完成。
SQL> select * from t;
        ID STATE             VAL 
---------- ---------- ---------- 
         1 2                   1
         1 2                   3
         1 3                   4
         1 3                   5 
         2 3                   5
         2 3                   4
         2 3                   6
         2 3                  56
已选择8行。

-----------------------------------------
SQL> select id,state ,val
  2  from (select id ,state,val, row_number()
  3  over (partition by id order by val desc ) rn from t )
  4  where rn=1;
        ID STATE             VAL
---------- ---------- ----------               
         1 3                   5                             
         2 3                  56  ---结果符合要求

-----查看使用分析函数的执行计划。
SQL> set autot on
SQL> select id,state ,val
  2  from (select id ,state,val, row_number()
  3  over (partition by id order by val desc ) rn from t )
  4  where rn=1;

        ID STATE             VAL                                               
---------- ---------- ----------                                               
         1 3                   5                                               
         2 3                  56                                               
----查询出 id 为1对应的最大值 5 ,id 为2 时对应的最大值56
----使用分析函数的执行计划
执行计划
----------------------------------------------------------                     
Plan hash value: 3047187157                                                                            
--------------------------------------------------------------------------------           
| Id  | Operation                | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------  
|   0 | SELECT STATEMENT         |      |     8 |   368 |     4  (25)| 00:00:01 |                                 
|*  1 |  VIEW                    |      |     8 |   368 |     4  (25)| 00:00:01 |                                 
|*  2 |   WINDOW SORT PUSHED RANK|      |     8 |   264 |     4  (25)| 00:00:01 |                                 
|   3 |    TABLE ACCESS FULL     | T    |     8 |   264 |     3   (0)| 00:00:01 |   

----注意这里使用了 WINDOW SORT PUSHED RANK,只进行了一次FTS
-------------------------------------------------------------------------------- 
Predicate Information (identified by operation id):                            
---------------------------------------------------   
   1 - filter("RN"=1)                                                          
   2 - filter(ROW_NUMBER() OVER ( PARTITION BY "ID" ORDER BY                   
              INTERNAL_FUNCTION("VAL") DESC )<=1)                              
Note                                                                           
-----                                                                          
   - dynamic sampling used for this statement                                  
统计信息
----------------------------------------------------------                     
         76  recursive calls                                                   
          0  db block gets                                                     
         21  consistent gets                                                   
          0  physical reads                                                     
                                                                  
             .......                             
          3  sorts (memory)                                                    
          0  sorts (disk)                                                      
          2  rows processed                                                    
SQL> set linesize 999
SQL> /
        ID STATE             VAL                                             
---------- ---------- ----------                                             
         1 3                   5     
         2 3                  56                                                                                  

执行计划
----------------------------------------------------------                                                 
Plan hash value: 3047187157
---------------------------------------------------------------------------------
| Id  | Operation                | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |      |     8 |   368 |     4  (25)| 00:00:01 |
|*  1 |  VIEW                                      |      |     8 |   368 |     4  (25)| 00:00:01 |
|*  2 |   WINDOW SORT PUSHED RANK|      |     8 |   264 |     4  (25)| 00:00:01 |
|   3 |    TABLE ACCESS FULL     | T    |     8 |   264 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------                         
Predicate Information (identified by operation id):                          
---------------------------------------------------                                                      
   1 - filter("RN"=1)
   2 - filter(ROW_NUMBER() OVER ( PARTITION BY "ID" ORDER BY                                        
              INTERNAL_FUNCTION("VAL") DESC )<=1)                                                                                                                                                
Note                                                                   
----                               
   - dynamic sampling used for this statement
统计信息
----------------------------------------------------------                                         
          0  recursive calls                                                   
          0  db block gets                                                  
          7  consistent gets   -------逻辑读为 7                                                                   
          0  physical reads                                                    
          0  redo size                                
        568  bytes sent via SQL*Net to client                    
        416  bytes received via SQL*Net from client                            
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)                                             
          0  sorts (disk)                                               
          2  rows processed
---没有使用分析函数的执行计划,可以看出此计划执行了两次全表扫描(见下面的例子),而使用了分析函数的情况下,只是扫描了一次!
---显然当数据量巨大时,两种方式的资源消耗差别是巨大的。
SQL> select id,state ,val from
  2  t a where a.val in
  3  (select max(b.val) from t b
  4  where b.id =a.id group by id ); ---在业务处理时应该慎用关联子查询!

        ID STATE             VAL                                           
---------- ---------- ----------      
         1 3                   5                                            
         2 3                  56                                                                                  
执行计划
----------------------------------------------------------
Plan hash value: 2845706984                                                                                       
-------------------------------------------------------------------------------
| Id  | Operation              | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |      |     1 |    33 |     6   (0)| 00:00:01 |                                   
|*  1 |  FILTER                |      |       |       |            |          |
|   2 |   TABLE ACCESS FULL    | T    |     8 |   264 |     3   (0)| 00:00:01 |
|*  3 |   FILTER               |      |       |       |            |          |
|   4 |    SORT GROUP BY NOSORT|      |     1 |    26 |     3   (0)| 00:00:01|        

|*  5 |     TABLE ACCESS FULL  | T    |     1 |    26 |     3   (0)| 00:00:01 |      

-------------------------------------------------------------------------------                 
Predicate Information (identified by operation id):        
---------------------------------------------------                                                               
   1 - filter( EXISTS (SELECT 0 FROM "T" "B" WHERE "B"."ID"=:B1 GROUP
              BY "ID" HAVING MAX("B"."VAL")=:B2))
   3 - filter(MAX("B"."VAL")=:B1)
   5 - filter("B"."ID"=:B1)
Note   
----- 
   - dynamic sampling used for this statement
统计信息
----------------------------------------------------------
        296  recursive calls
          0  db block gets
        203  consistent gets  ---逻辑读为 203 是使用分析函数的30倍!
  0  physical reads 
           11  sorts (memory)                   
          0  sorts (disk)                        
          2  rows processed

总结:相关子查询对于外部查询的每一个值都会有一个结果与其对应,其计算的过程是这样的:
1.扫描外查询的第一条记录
2.扫描子查询,并将第一条记录的对应值传给子查询,由此计算出子查询的结果
3.根据子查询的结果,返回外查询的结果。
4.重复上述动作,开始扫描外查询的第二条记录,第三条记录,直至全部扫描完毕。

所以,当使用分析函数能够满足业务需求时,最好使用分析函数。处理大量业务数据时,相关子查询是相当耗费资源的。慎用!

上一篇:Python数据写入csv格式文件


下一篇:Python Django(WEB电商项目构建)