oracle 的自连接(self join)操作
对于oracle数据库来说,目前支持两套语法,一套是oracle自己的sql语法,一套是通行标准的SQL99语法,那么对于oracle的连接操作
来说,也完全可以使用这样的两套语法来分别的实现。当然从效率上来说,两者是没有差别的。只不过从我的角度来讲,oracle的语法更加简洁而已。
比如说我们有一张表emp,表里数据如下
scott@DB01> conn scott/tiger
Connected.
scott@DB01>
scott@DB01> set pagesize 100
scott@DB01> select * from emp;
EMPNO
ENAME
JOB
MGR
HIREDATE
SAL
COMM
DEPTNO
---------- ---------- --------- ---------- ------------ ----------
---------- ----------
7369
SMITH
CLERK
7902
17-DEC-80
800
20
7499
ALLEN
SALESMAN
7698
20-FEB-81
1600
300
30
7521
WARD
SALESMAN
7698
22-FEB-81
1250
500
30
7566
JONES
MANAGER
7839
02-APR-81
2975
20
7654
MARTIN
SALESMAN
7698
28-SEP-81
1250
1400
30
7698
BLAKE
MANAGER
7839
01-MAY-81
2850
30
7782
CLARK
MANAGER
7839
09-JUN-81
2450
10
7788
SCOTT
ANALYST
7566
19-APR-87
3000
20
7839
KING
PRESIDENT
17-NOV-81
5000
10
7844
TURNER
SALESMAN
7698
08-SEP-81
1500
0
30
7876
ADAMS
CLERK
7788
23-MAY-87
1100
20
7900
JAMES
CLERK
7698
03-DEC-81
950
30
7902
FORD
ANALYST
7566
03-DEC-81
3000
20
7934
MILLER
CLERK
7782
23-JAN-82
1300
10
14 rows selected.
通过观察发现,在表中数据库有从属关系,比如7369
smith的经理是7902,7902也是一个员工,名称是ford,他的经理是7566
jones,以此类推,我们最终可以找到公司的boss 7839 king。
现在我的需求是,找到公司里的上下级对应关系,我们分别用oracle的语法和sql99的语法来实现,我们可以得到同样的结果
oracle语法:我们注意到from后面
表emp出现两次,并且使用了不同的别名,也就是表自己和自己来关联,所以我们称这样的连接为自连接操作。当然了,以下的例子中使用了外连接操作,我以后会单独的说明。
scott@DB01> select w.ename||' report for ' ||m.ename
"Relations"
2
from emp w,emp m
3
where w.empno=m.empno(+);
Relations
--------------------------------
SMITH report for SMITH
ALLEN report for ALLEN
WARD report for WARD
JONES report for JONES
MARTIN report for MARTIN
BLAKE report for BLAKE
CLARK report for CLARK
SCOTT report for SCOTT
KING report for KING
TURNER report for TURNER
ADAMS report for ADAMS
JAMES report for JAMES
FORD report for FORD
MILLER report for MILLER
14 rows selected.
sql99语法:join on 语法的实现
scott@DB01> select w.ename||' report for ' ||m.ename
"Relations"
2
from emp w left outer join emp m
3
on (w.mgr=m.empno);
Relations
--------------------------------
SMITH report for FORD
ALLEN report for BLAKE
WARD report for BLAKE
JONES report for KING
MARTIN report for BLAKE
BLAKE report for KING
CLARK report for KING
SCOTT report for JONES
KING report for
TURNER report for BLAKE
ADAMS report for SCOTT
JAMES report for BLAKE
FORD report for JONES
MILLER report for CLARK
14 rows selected.
其实,oracle对于处理表中有从属关系的记录,提供了一种查询方法,我们称之为层次查询,来看一个例子
scott@DB01> select ename ||' report for '|| prior
ename
2
from emp
3
start with
empno=7839
---规定遍历从属关系树的起点
4
connect by prior
empno=mgr;
----规定了遍历的方向
ENAME||'REPORTFOR'||PRIORENAME
--------------------------------
KING report for
JONES report for KING
SCOTT report for JONES
ADAMS report for SCOTT
FORD report for JONES
SMITH report for FORD
BLAKE report for KING
ALLEN report for BLAKE
WARD report for BLAKE
MARTIN report for BLAKE
TURNER report for BLAKE
JAMES report for BLAKE
CLARK report for KING
MILLER report for CLARK
14 rows selected.
为了使查询的结果更具有可读性,我们借助一个函数SYS_CONNECT_BY_PATH来做一下处理,来看结果:
scott@DB01> SELECT LPAD(' ',
2*level-1)||SYS_CONNECT_BY_PATH(ename, '/') "Path"
2 FROM emp
3 START WITH empno =
7839
4 CONNECT BY PRIOR
empno=mgr;
Path
-------------------------------------------------------------------------------
/KING
/KING/JONES
/KING/JONES/SCOTT
/KING/JONES/SCOTT/ADAMS
/KING/JONES/FORD
/KING/JONES/FORD/SMITH
/KING/BLAKE
/KING/BLAKE/ALLEN
/KING/BLAKE/WARD
/KING/BLAKE/MARTIN
/KING/BLAKE/TURNER
/KING/BLAKE/JAMES
/KING/CLARK
/KING/CLARK/MILLER
14 rows selected.
通过上面的结果,你是否发现,从属关系更加清晰,也更加直观呢。
当然了对于我们来讲,好看的格式化的输出只是锦上添花,更重要的是语句的执行的速度,也就是性能的考虑
借助于autotrace工具,我们来看语句的执行计划
scott@DB01> set autotrace trace exp
scott@DB01>
scott@DB01> SELECT LPAD(' ',
2*level-1)||SYS_CONNECT_BY_PATH(ename, '/') "Path"
2 FROM emp
3 START WITH empno =
7839
4 CONNECT BY PRIOR
empno=mgr;
Execution Plan
----------------------------------------------------------
Plan hash value: 3613731379
---------------------------------------------------------------------------------------
| Id |
Operation
| Name | Rows
| Bytes | Cost (%CPU)|
Time
|
---------------------------------------------------------------------------------------
| 0 | SELECT
STATEMENT
|
| 14
| 196
|
3 (0)| 00:00:01 |
|* 1 | CONNECT BY WITH
FILTERING
|
|
|
|
|
|
| 2
| TABLE ACCESS BY INDEX ROWID|
EMP
|
|
|
|
|
|* 3
| INDEX
UNIQUE
SCAN
| PK_EMP
|
1
|
4
|
0 (0)| 00:00:01 |
|* 4 | HASH
JOIN
|
|
|
|
|
|
| 5
| CONNECT BY
PUMP
|
|
|
|
|
|
| 6
| TABLE
ACCESS
FULL
| EMP
| 14
| 196
|
3 (0)| 00:00:01 |
| 7
| TABLE ACCESS
FULL
| EMP
| 14
| 196
|
3 (0)| 00:00:01 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("EMPNO"=7839)
3 - access("EMPNO"=7839)
4 - access("MGR"=NULL)
scott@DB01> select w.ename||' report for ' ||m.ename
"Relations"
2 from emp w,emp m
3 where
w.empno=m.empno(+);
Execution Plan
----------------------------------------------------------
Plan hash value: 2199491010
---------------------------------------------------------------------------------------
| Id |
Operation
| Name | Rows
| Bytes | Cost (%CPU)|
Time
|
---------------------------------------------------------------------------------------
| 0 | SELECT
STATEMENT
|
| 14
| 280
|
4 (0)| 00:00:01 |
| 1 | NESTED
LOOPS
OUTER
|
| 14
| 280
|
4 (0)| 00:00:01 |
| 2
| TABLE ACCESS
FULL
| EMP
| 14
| 140
|
3 (0)| 00:00:01 |
| 3
| TABLE ACCESS BY INDEX ROWID|
EMP
|
1 | 10
|
1 (0)| 00:00:01 |
|* 4
| INDEX
UNIQUE
SCAN
| PK_EMP
|
1
|
|
0 (0)| 00:00:01 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 -
access("W"."EMPNO"="M"."EMPNO"(+))
通过上面的例子我们可以发现,层次查询的速度也要快于自连接操作,当然了,我们在这里的讨论,并没有考虑两种语句对其他资源的影响。