作者:杨云龙,瀚高PG实验室核心成员,数据库高级工程师,擅长HGDB、Post-greSQL、Oracle等主流数据库。
表连接方式方法介绍(Nested Loop/Hash Join/Merge Join/Join) 数据库版本(oracle11.2.0.4 and PostgreSQL 13.1)
环境构造
-- oracle11.2.0.4
--城市、国家
drop table country purge;
CREATE TABLE country (
country_id int primary key,
country_name VARCHAR(50) NOT NULL
);
drop table city purge;
CREATE TABLE city (
city_id int primary key,
city_name VARCHAR(50) NOT NULL,
country_id int NOT NULL
);
begin
for i in 1 .. 10 loop
insert into country values(i,'country'||i);
end loop;
commit;
end;
begin
for i in 1 .. 10000 loop
insert into city values(i,'city'||i,ceil(i/1000));
end loop;
commit;
end;
execute dbms_stats.gather_table_stats(ownname =>'SCOTT',tabname => 'CITY' ,estimate_percent=> 100 ,cascade => true);
execute dbms_stats.gather_table_stats(ownname =>'SCOTT',tabname => 'COUNTRY' ,estimate_percent=> 100 ,cascade => true);
-- PostgreSQL 13.1
drop table country;
CREATE TABLE country (
country_id integer primary key,
country_name text NOT NULL
);
drop table city purge;
CREATE TABLE city (
city_id integer primary key,
city_name text NOT NULL,
country_id integer NOT NULL
);
insert into country values (generate_series(1,10),'country'||generate_series(1,10));
insert into city values(generate_series(1,10000),'city'||generate_series(1,10000),ceil(random()*(10-1)+1));
analyze city;
analyze country;
Nested Loop
如上图所示,为Nested Loop方式介绍。 其算法:驱动表返回一行数据,通过连接列传值给被驱动表,驱动表返回多少行,被驱动表就要被扫描多少次。 被驱动表索引方式索引唯一扫描或者范围扫描。在被驱动表数据子集较少的情况下,嵌套循环是比较好的选择,也就是适合结果集比较小的查询,通常超过10000行被认为大行,会变成低效。 举例如下:
Oracle 执行举例
SQL> explain plan for select city_name,country_name from city,country where city.country_id=country.country_id and city.city_id=99;
Explained.
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
Plan hash value: 2738185913
---------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 93 | 2 (0)| 00:00:01 |
| 1 | NESTED LOOPS | | 1 | 93 | 2 (0)| 00:00:01 |
| 2 | TABLE ACCESS BY INDEX ROWID| CITY | 1 | 53 | 1 (0)| 00:00:01 |
|* 3 | INDEX UNIQUE SCAN | SYS_C0018242 | 1 | | 1 (0)| 00:00:01 |
| 4 | TABLE ACCESS BY INDEX ROWID| COUNTRY | 1 | 40 | 1 (0)| 00:00:01 |
|* 5 | INDEX UNIQUE SCAN | SYS_C0018239 | 1 | | 0 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("CITY"."CITY_ID"=99)
5 - access("CITY"."COUNTRY_ID"="COUNTRY"."COUNTRY_ID")
18 rows selected.
语句对比
--下面方式 oracle会选择 hash 全表扫描
explain plan for select city_name,country_name from city,country where city.country_id=country.country_id and city.city_id>=10;
--强制使用nested loop 方式
explain plan for select /*+ leading(city) use_nl(country) */ city_name,country_name from city,country
where city.country_id=country.country_id and city.city_id>=10;
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
Plan hash value: 103883790
---------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 9991 | 907K| 10005 (1)| 00:02:01 |
| 1 | NESTED LOOPS | | 9991 | 907K| 10005 (1)| 00:02:01 |
| 2 | NESTED LOOPS | | 9991 | 907K| 10005 (1)| 00:02:01 |
|* 3 | TABLE ACCESS FULL | CITY | 9991 | 517K| 11 (0)| 00:00:01 |
|* 4 | INDEX UNIQUE SCAN | SYS_C0018239 | 1 | | 0 (0)| 00:00:01 |
| 5 | TABLE ACCESS BY INDEX ROWID| COUNTRY | 1 | 40 | 1 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("CITY"."CITY_ID">=10)
4 - access("CITY"."COUNTRY_ID"="COUNTRY"."COUNTRY_ID")
Note
-----
- dynamic sampling used for this statement (level=2)
22 rows selected.
--由上所示,被驱动别执行了 9991次,也就是(10000-9),资源消耗明显上升
Postgresql 执行举例
mydb=# explain analyze select city_name,country_name from city,country where city.country_id=country.country_id and city.city_id=99;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.44..16.49 rows=1 width=40) (actual time=0.031..0.036 rows=1 loops=1)
-> Index Scan using city_pkey on city (cost=0.29..8.30 rows=1 width=12) (actual time=0.015..0.017 rows=1 loops=1)
Index Cond: (city_id = 99)
-> Index Scan using country_pkey on country (cost=0.15..8.17 rows=1 width=36) (actual time=0.006..0.006 rows=1 loops=1)
Index Cond: (country_id = city.country_id)
Planning Time: 0.621 ms
Execution Time: 0.098 ms
(7 rows)
--返回多行,强制使用nested loop,比默认选择hash join 多30ms
mydb=# set enable_hashjoin=off;
SET
mydb=# set enable_mergejoin=off;
SET
mydb=# explain analyze select city_name,country_name from city,country where city.country_id=country.country_id and city.city_id>=10;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.15..1934.29 rows=9991 width=40) (actual time=0.050..43.354 rows=9991 loops=1)
-> Seq Scan on city (cost=0.00..188.00 rows=9991 width=12) (actual time=0.025..4.870 rows=9991 loops=1)
Filter: (city_id >= 10)
Rows Removed by Filter: 9
-> Index Scan using country_pkey on country (cost=0.15..0.17 rows=1 width=36) (actual time=0.002..0.002 rows=1 loops=9991)
Index Cond: (country_id = city.country_id)
Planning Time: 0.393 ms
Execution Time: 45.366 ms
(8 rows)
PG Nested loop说明
Postgresql使用嵌套循环执行查询,那么它可以通过迭代表1中的所有条目、迭代表2中的所有条目,然后在表1和表2中的行对满足筛选条件时发出一行。嵌套循环是Postgresql唯一可以用来处理任何连接的连接算法。例如下面语句,Oracle会选择Merge Join,Postgresql则选择 Nested Loop,详细可参考 Merge Join部分。
--即使数据量较大情况,以下语句pg还会选择Nested Loop
explain select a.city_name,b.city_id from city a,city2 b where a.country_id<b.country_id;
Hash Join
两表关联时主要依靠哈希运算来得到结果集的表连接方式,只支持等值。 算法:两表等值连接,返回大量数据,较少的表选为驱动表,将驱动表相关列读入PGA中的work area(PG则放入内存work_mem),然后对驱动表的连接列进行hash运算生成hash table,然后读取被驱动表并对关联列进行hash运算,然后到pga探测hash table,找到数据关联上。如果HASH表太大,无法一次构造在内存中,则分成若干个partition,写入磁盘的temporary segment,则会多一个写的代价,会降低效率
上图为Hash大概流程,如果统计信息等都准确,数据库会自动选择最好执行计划。当使用ORDERED提示时,FROM子句中的第一张表将用于建立哈希表。
Oracle举例
-- country 作为驱动表,不管使用那种join(left/right/full)方式,Oracle都会选择小表为驱动表。
SQL> explain plan for select city_name,country_name from city,country where city.country_id=country.country_id;
Explained.
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
Plan hash value: 114462077
------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10000 | 781K| 14 (0)| 00:00:01 |
|* 1 | HASH JOIN | | 10000 | 781K| 14 (0)| 00:00:01 |
| 2 | TABLE ACCESS FULL| COUNTRY | 10 | 400 | 3 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL| CITY | 10000 | 390K| 11 (0)| 00:00:01 |
------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("CITY"."COUNTRY_ID"="COUNTRY"."COUNTRY_ID")
Note
-----
- dynamic sampling used for this statement (level=2)
19 rows selected.
Postgresql举例
mydb=# explain analyze select city_name,country_name from city,country where city.country_id=country.country_id;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
Hash Join (cost=38.58..227.91 rows=10000 width=40) (actual time=0.185..12.086 rows=10000 loops=1)
Hash Cond: (city.country_id = country.country_id)
-> Seq Scan on city (cost=0.00..163.00 rows=10000 width=12) (actual time=0.026..3.735 rows=10000 loops=1)
-> Hash (cost=22.70..22.70 rows=1270 width=36) (actual time=0.064..0.067 rows=10 loops=1)
Buckets: 2048 Batches: 1 Memory Usage: 17kB
-> Seq Scan on country (cost=0.00..22.70 rows=1270 width=36) (actual time=0.025..0.031 rows=10 loops=1)
Planning Time: 1.953 ms
Execution Time: 13.983 ms
(8 rows)
为了可以使用小表建立hash table,优化器会转换,如 right join 被改为Left Join
mydb=# explain analyze select * from country right join city on city.country_id=country.country_id;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
Hash Left Join (cost=38.58..227.91 rows=10000 width=52) (actual time=0.087..11.883 rows=10000 loops=1)
Hash Cond: (city.country_id = country.country_id)
-> Seq Scan on city (cost=0.00..163.00 rows=10000 width=16) (actual time=0.020..3.035 rows=10000 loops=1)
-> Hash (cost=22.70..22.70 rows=1270 width=36) (actual time=0.044..0.047 rows=10 loops=1)
Buckets: 2048 Batches: 1 Memory Usage: 17kB
-> Seq Scan on country (cost=0.00..22.70 rows=1270 width=36) (actual time=0.024..0.028 rows=10 loops=1)
Planning Time: 0.413 ms
Execution Time: 13.588 ms
(8 rows)
Merge Join
两张表在做连接时用排序操作和合并操作来得到结果集的连接方式。排序主要处理非等值管理。算法:先对两张表根据连接列各自进行排序,嵌套循环是从被驱动表的索引中匹配数据,排序合并连接是从内存(PGA中的work area)中匹配数据,严格来说没有驱动表,可以认为较少表作为驱动表。HASH JOIN只需要把驱动表放入PGA中,但是排序连接合并需要将两表结果集均放入PGA中
上图为Merge大概流程,如果统计信息等都准确,数据库会自动选择最好执行计划。
Oracle举例
col PLAN_TABLE_OUTPUT for a100
set lines 200 pages 999
explain plan for select a.city_name,b.country_name from city a,country b where a.country_id<b.country_id;
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
Plan hash value: 1026867539
------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 45000 | 1054K| | 62 (2)| 00:00:01 |
| 1 | MERGE JOIN | | 45000 | 1054K| | 62 (2)| 00:00:01 |
| 2 | SORT JOIN | | 10 | 120 | | 2 (0)| 00:00:01 |
| 3 | TABLE ACCESS BY INDEX ROWID| COUNTRY | 10 | 120 | | 2 (0)| 00:00:01 |
| 4 | INDEX FULL SCAN | SYS_C0018239 | 10 | | | 1 (0)| 00:00:01 |
|* 5 | SORT JOIN | | 10000 | 117K| 408K| 60 (2)| 00:00:01 |
| 6 | TABLE ACCESS FULL | CITY | 10000 | 117K| | 11 (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - access(INTERNAL_FUNCTION("A"."COUNTRY_ID")<INTERNAL_FUNCTION("B"."COUNTRY_ID"))
filter(INTERNAL_FUNCTION("A"."COUNTRY_ID")<INTERNAL_FUNCTION("B"."COUNTRY_ID"))
19 rows selected.
Postgresql举例
merge join需要两个JOIN的表的KEY都是先排好顺序的,有索引的话没有排序过程,merge join的好处是两个表都只扫描一次. 而nested loop的话其中一个表扫描一次, 另一个表则循环多次.
Postgresql中,Merge Join只适用于自然连接和等值连接,如下例所示,即使将enable_nestloop关闭,pg在此种连接方式时都会转变成Nested Loop,且初始成本估算很大。由此看出,Postgresql与Oracle 机制有区别。
mydb=# analyze city;
ANALYZE
mydb=# analyze country;
ANALYZE
mydb=# set enable_nestloop=off;
SET
mydb=# explain analyze select a.city_name,b.country_name from city a,country b where a.country_id<b.country_id;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=10000000000.00..10000016637.12 rows=333333 width=18) (actual time=0.104..419.608 rows=400687 loops=1)
Join Filter: (a.country_id < b.country_id)
Rows Removed by Join Filter: 599313
-> Seq Scan on city a (cost=0.00..1636.00 rows=100000 width=13) (actual time=0.066..20.058 rows=100000 loops=1)
-> Materialize (cost=0.00..1.15 rows=10 width=13) (actual time=0.000..0.001 rows=10 loops=100000)
-> Seq Scan on country b (cost=0.00..1.10 rows=10 width=13) (actual time=0.014..0.019 rows=10 loops=1)
Planning Time: 0.649 ms
Execution Time: 457.499 ms
(8 rows)
hash join 只有当在内存中执行是最快连接方式,当pg认为hash连接所需超出内存时,会选择Merge join,Merge Join可以更有效使用磁盘。Postgresql中,Merge Join仅适用于条件为相等的连接。其工作原理,比较每个输出的第一行(最小值),如果相等,结果中添加一行,并从input删除已比较的两行;如果字段不相同,删除id最小的,并继续与下一个最小的对比。
Postgres可以非常有效地在磁盘上执行合并,方法是将数据的一小部分保存在内存中,对该数据执行合并,将合并结果写入磁盘,然后从磁盘将更多的输入读取到内存中。这种方法一次在磁盘上读写大量的数据,比Hash 随机读高效。
mydb=# explain select a.city_name,b.city_id from city a,city2 b where a.country_id=b.country_id and a.city_id<30000;
QUERY PLAN
--------------------------------------------------------------------------------------
Hash Join (cost=3274.21..3764393.24 rows=333242482 width=13)
Hash Cond: (a.country_id = b.country_id)
-> Index Scan using city_pkey on city a (cost=0.29..1052.87 rows=30033 width=13)
Index Cond: (city_id < 30000)
-> Hash (cost=1634.63..1634.63 rows=99863 width=8)
-> Seq Scan on city2 b (cost=0.00..1634.63 rows=99863 width=8)
(6 rows)
mydb=# set enable_hashjoin=off;
SET
mydb=# explain select a.city_name,b.city_id from city a,city2 b where a.country_id=b.country_id and a.city_id<30000;
QUERY PLAN
--------------------------------------------------------------------------------------------
Merge Join (cost=13213.55..5012000.94 rows=333242482 width=13)
Merge Cond: (a.country_id = b.country_id)
-> Sort (cost=3286.46..3361.55 rows=30033 width=13)
Sort Key: a.country_id
-> Index Scan using city_pkey on city a (cost=0.29..1052.87 rows=30033 width=13)
Index Cond: (city_id < 30000)
-> Sort (cost=9927.09..10176.74 rows=99863 width=8)
Sort Key: b.country_id
-> Seq Scan on city2 b (cost=0.00..1634.63 rows=99863 width=8)
(9 rows)
Merge Join方法主要是内存不够用时,效果更好
当数据量特别大时,会选择Merge Join,如果没有索引,会选择"Materialize",将结果缓存内存或文件中,避免多次全表扫描。如下所示:
mydb=# create unlogged table tbl1(id int, info text) with (autovacuum_enabled=off);
CREATE TABLE
mydb=# create unlogged table tbl2(id int, info text) with (autovacuum_enabled=off);
CREATE TABLE
mydb=# insert into tbl1 select generate_series(1,10000000);
INSERT 0 10000000
mydb=# insert into tbl2 select generate_series(1,10000000);
INSERT 0 10000000
mydb=# explain select * from tbl1,tbl2 where tbl1.id=tbl2.id;
QUERY PLAN
----------------------------------------------------------------------------------
Merge Join (cost=2075531.84..2370522825.11 rows=157893676470 width=72)
Merge Cond: (tbl1.id = tbl2.id)
-> Sort (cost=1037765.92..1051814.66 rows=5619496 width=36)
Sort Key: tbl1.id
-> Seq Scan on tbl1 (cost=0.00..100442.96 rows=5619496 width=36)
-> Materialize (cost=1037765.92..1065863.40 rows=5619496 width=36)
-> Sort (cost=1037765.92..1051814.66 rows=5619496 width=36)
Sort Key: tbl2.id
-> Seq Scan on tbl2 (cost=0.00..100442.96 rows=5619496 width=36)
(9 rows)
mydb=# explain select * from tbl1,tbl2 where tbl1.id<tbl2.id;
QUERY PLAN
----------------------------------------------------------------------------
Nested Loop (cost=0.00..720393977232.90 rows=10526245098005 width=72)
Join Filter: (tbl1.id < tbl2.id)
-> Seq Scan on tbl1 (cost=0.00..100442.96 rows=5619496 width=36)
-> Materialize (cost=0.00..172443.44 rows=5619496 width=36)
-> Seq Scan on tbl2 (cost=0.00..100442.96 rows=5619496 width=36)
(5 rows)
mydb=# set enable_mergejoin=off;
SET
mydb=# explain select * from tbl1,tbl2 where tbl1.id=tbl2.id;
QUERY PLAN
----------------------------------------------------------------------------
Hash Join (cost=214589.66..5526742276.56 rows=157893676470 width=72)
Hash Cond: (tbl1.id = tbl2.id)
-> Seq Scan on tbl1 (cost=0.00..100442.96 rows=5619496 width=36)
-> Hash (cost=100442.96..100442.96 rows=5619496 width=36)
-> Seq Scan on tbl2 (cost=0.00..100442.96 rows=5619496 width=36)
(5 rows)
mydb=# explain select * from tbl1,tbl2 where tbl1.id=tbl2.id and tbl1.id<2000;
QUERY PLAN
----------------------------------------------------------------------------------
Merge Join (cost=1449858.75..790941515.17 rows=52631216124 width=72)
Merge Cond: (tbl1.id = tbl2.id)
-> Sort (cost=412092.83..416775.74 rows=1873165 width=36)
Sort Key: tbl1.id
-> Seq Scan on tbl1 (cost=0.00..114491.70 rows=1873165 width=36)
Filter: (id < 2000)
-> Materialize (cost=1037765.92..1065863.40 rows=5619496 width=36)
-> Sort (cost=1037765.92..1051814.66 rows=5619496 width=36)
Sort Key: tbl2.id
-> Seq Scan on tbl2 (cost=0.00..100442.96 rows=5619496 width=36)
(10 rows)
三种连接方法的优劣
还有一种连接叫笛卡尔积(Cross Join/Cartesian Product),两表关联没有任何条件。结果集为两张表行数的乘积,此种方式基本不会用到,这次不做讨论。
类别 | Nested Loop | Hash Join | Merge Join |
---|---|---|---|
使用条件 | 任何条件 | 等值连接 | 非等值,’<>'除外 |
使用资源 | CPU, disk IO | 内存、临时空间 | 内存、临时空间 |
特点 | 高选择索引或进行限制性搜索时效率高,能快速返回第一次搜索结果 | 表记录多,缺乏有效索引, 效率高 | 非等值连接,效率高 |
缺点 | 数据量大效率低,尤其无索引或限制条件不够时 | 需要大量内存,可能会用磁盘 | 表都需要排序,结果最后返回,PG批量依次内存排序 |
Join连接类型
inner Join
--结果只包含满足连接条件的记录
select a.id,a.col1,b.col2 from a,b where a.id=b.id;
--or
select a.id,a.col1,b.col2 from a inner join b on a.id=b.id;
--or
select id,a.col1,b.col2 from a inner join b using(id);
Full outer Join
SELECT * FROM TableA FULL OUTER JOIN TableB ON TableA.name = TableB.name;
ID NAME ID NAME
---------- ------ ---------- ------
1 aaa 1 aaa
2 bbb
3 ccc
--如果没有匹配记录,null作为值,可以去掉相同结果集
SELECT * FROM TableA FULL OUTER JOIN TableB ON TableA.name = TableB.name
where tablea.id is null or tableb.id is null;
ID NAME ID NAME
---------- ------ ------ ------
2 bbb
3 ccc
left/right join
--如上图所示,产生A表所有结果,B没有的用null作为值,除相同值外,right与left正好相反
SELECT * FROM TableA left JOIN TableB ON TableA.name = TableB.name;
ID NAME ID NAME
---------- ------ ---------- ------
1 aaa 1 aaa
3 ccc
--可以去掉相同结果集
SELECT * FROM TableA left JOIN TableB ON TableA.name = TableB.name
where tableb.id is null;
ID NAME ID NAME
---------- ------ ---------- ------
3 ccc
union 介绍
UNION 操作符用于合并两个或多个 SELECT 语句的结果集。
请注意,UNION 内部的 SELECT 语句必须拥有相同数量的列。列也必须拥有相似的数据类型。同时,每条 SELECT 语句中的列的顺序必须相同。UNION 只选取记录,有去重功能,而UNION ALL会列出所有记录。
SELECT name FROM TableA UNION SELECT name FROM TableB;
NAME
------------------------------
aaa
bbb
SELECT name FROM TableA UNION all SELECT name FROM TableB;
NAME
------------------------------
aaa
aaa
bbb
有些情况下可以使用union 代替or ,提供查询效率
select * from t1 where owner='SCOTT' or object_id in (select object_id from t2);
--上述语句会使用FILTER,可以调整sql,或者t2.object_id创建索引
select * from t1 where owner='SCOTT'
union
select * from t1 where object_id in (select object_id from t2);
--列转行举例
select employee,child_1 as child from personnel
union all
select employee,child_2 as child from personnel;
自连接
分页查询
--四层分页更适合页数比较大的分页查询,返回结果集小的当做驱动表,扫描被驱动表。
with tmp as
(SELECT * FROM
( SELECT rid, ROWNUM as RN
FROM
(SELECT rowid as rid
FROM t1
where owner='SYS'
order by object_id desc
) WHERE ROWNUM <= 500
) WHERE RN > 490
)
select /*+ use_nl(a) leading(b) */ owner,object_name,object_id,rn
from t1 a,tmp b
where a.rowid=b.rid;
--三层select是常用的
select * from (select owner,object_name,object_id,rownum as rn
from(select owner,object_name,object_id from t1 where owner='SYS' order by object_id desc)
where rownum<=500)
where rn>490;
删除重复行
--极致函数
delete from products p1
where rowid < (
select max(p2.rowid) from products p2
where p1.name = p2.name
and p1.price =p2.price
);
--使用非等值连接
delete from products p1
where exists (
select * from products p2
where p1.name = p2.name
and p1.price = p2.price
and p1.rowid < p2.rowid
);
关联子查询比较行与行 eg
--求与上一年营业额一样的年份
select year,sale from sales s1
where sale = (
select sale from sales s2
where s2.year = s1.year-1
) order by year;
--or
select s1.year,s1.sale from sales s1,sales s2
where s2.sale =s1.sale
and s2.year = s1.year-1
order by year;
--求同期对比情况
select s1.year,s1.sale,
case when s1.sale = s2.sale then '持平'
when s1.sale > s2.sale then '增长'
when s1.sale < s2.sale then '减少'
else '-' end as var
from sales s1,sales s2
where s2.year = s1.year -1
order by year;
参考文档
- Join introduce: https://www.cnblogs.com/xqzt/p/4469673.html
- Join introduce: https://logicalread.com/
- pg join method: http://www.jasongj.com/2015/03/07/Join1/
- pg join method: https://developer.aliyun.com/article/70992
- pg join method: https://www.interdb.jp/pg/pgsql03.html
- pg Merge Joins: https://malisper.me/postgres-merge-joins/
- 表连接方法参考 :书籍 《SQL 优化核心思想》
- SQL连接示例参考:书籍 《SQL进阶教程》