PostgreSQL 11 版本索引方面一个显著的新特性是创建索引时支持 INCLUDE COLUMNS ,语法如下
CREATE INDEX idx_name ON table_name USING BTREE (column_a) INCLUDE (column_b);
一、Release 中的说明
Allow indexes to INCLUDE columns that are not part of the unique constraint but are available for index-only scans (Anastasia Lubennikova, Alexander Korotkov, Teodor Sigaev)
This is also useful for including columns that dont have btree support
此特性主要用途和使用场景:
- 如果字段不支持btree索引,可以使用INCLUDE方式索引。
- 使表上的更多SQL能走 Index-Only Scans。
以上描述颇为费力,以下通过实例演示。
二、验证: 不支持Btree索引的字段,使用Include方式索引
首先验证第一点:不支持Btree索引的字段支持使用INCLUDE方式索引,创建测试表。
francs=> CREATE TABLE t_json1(a serial, user_info json);
CREATE TABLE
在(a,user_info) 字段上创建 btree 索引,如下:
francs=> CREATE INDEX t_json1_idx1 ON t_json1 USING BTREE(a,user_info);
ERROR: data type json has no default operator class for access method "btree"
HINT: You must specify an operator class for the index or define a default operator class for the data type.
以上创建索引报错,是由于 json 不支持 btree 索引。
使用 INCLUDE 方式创建索引成功,如下:
francs=> CREATE INDEX t_json1_idx_include ON t_json1 USING btree(a) INCLUDE(user_info);
CREATE INDEX
三、验证: Include Columns 支持 Index-Only Scans
创建测试表并插入300万数据,如下:
francs=> CREATE TABLE t_include(a int4, name text);
CREATE TABLE
francs=> INSERT INTO t_include(a,name) SELECT n,n || '_INCLUDE TEST' FROM generate_series(1,3000000) n;
INSERT 0 3000000
在字段a上创建索引,如下:
francs=> CREATE INDEX idx_t_include_a ON t_include USING BTREE (a);
CREATE INDEX
where条件中只包含a,查询a字段,以下SQL走了 Index Only Scan。
francs=> EXPLAIN ANALYZE SELECT a FROM t_include WHERE a<5;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
Index Only Scan using idx_t_include_a on t_include (cost=0.43..8.50 rows=4 width=4) (actual time=0.017..0.018 rows=4 loops=1)
Index Cond: (a < 5)
Heap Fetches: 4
Planning Time: 0.272 ms
Execution Time: 0.038 ms
(5 rows)
加入 name 字段后,不走 Index Only Scan,如下。
francs=> EXPLAIN ANALYZE SELECT a,name FROM t_include WHERE a<5;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
Index Scan using idx_t_include_a on t_include (cost=0.43..8.50 rows=4 width=24) (actual time=0.005..0.007 rows=4 loops=1)
Index Cond: (a < 5)
Planning Time: 0.125 ms
Execution Time: 0.025 ms
(4 rows)
加入 name 后走了 Index Scan using ,根据索引回表查询name字段。
创建索引时使用 INCLUDE(name),如下
francs=> CREATE INDEX idx_t_include ON t_include USING BTREE (a) INCLUDE (name);
CREATE INDEX
再次执行查询,走了 Index Only Scan
francs=> VACUUM ANALYZE t_include;
VACUUM
francs=> EXPLAIN ANALYZE SELECT a,name FROM t_include WHERE a<5;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
Index Only Scan using idx_t_include on t_include (cost=0.43..4.50 rows=4 width=24) (actual time=0.017..0.018 rows=4 loops=1)
Index Cond: (a < 5)
Heap Fetches: 0
Planning Time: 0.175 ms
Execution Time: 0.038 ms
(5 rows)
只查询name字段,也走了 Index Only Scan,如下
francs=> EXPLAIN ANALYZE SELECT name FROM t_include WHERE a<5;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
Index Only Scan using idx_t_include on t_include (cost=0.43..4.50 rows=4 width=20) (actual time=0.012..0.014 rows=4 loops=1)
Index Cond: (a < 5)
Heap Fetches: 0
Planning Time: 0.163 ms
Execution Time: 0.038 ms
(5 rows)
四、Include 索引的限制
- 目前只有 Btree 索引支持 INCLUDE COLUMNS(INCLUDE中的字段物理上位于btree索引叶子节点)。
- INCLUDE COLUMNS 中的字段不支持函数索引。
- INCLUDE COLUMNS 中的字段数据类型可以不支持 btree 索引,例如 JSON 等数据类型。
五、两种索引方式差异
或许有朋友问以下两种索引方式有啥区别? 这个问题欢迎大家留言讨论。
CREATE INDEX idx_t_not_include ON t_include USING BTREE (a,name);
CREATE INDEX idx_t_include ON t_include USING BTREE (a) INCLUDE (name);
六、参考
- CREATE INDEX
- Postgres 11 highlight - Covering Indexes
- WAITING FOR POSTGRESQL 11 – INDEXES WITH INCLUDE COLUMNS AND THEIR SUPPORT IN B-TREE
新书推荐
最后推荐和张文升共同编写的《PostgreSQL实战》,本书基于PostgreSQL 10 编写,共18章,重点介绍SQL高级特性、并行查询、分区表、物理复制、逻辑复制、备份恢复、高可用、性能优化、PostGIS等,涵盖大量实战用例!