PostgreSQL CLUSTER: How to Cluster a Table on an Index

2022-01-16 14:35:47

Cluster the data of the public.test table based on the test_id_idx index:

CLUSTER public.test USING test_id_idx;
CLUSTER public.test; -- re-cluster using the index from the previous CLUSTER
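
The short form only works once the table has a remembered clustering index. A minimal sketch of how to inspect or set that index, assuming the table and index names used above; `pg_index.indisclustered` is the catalog flag PostgreSQL sets on the index last used for clustering:

```sql
-- Show the index that public.test was last clustered on (no rows if none).
SELECT i.indexrelid::regclass AS clustered_index
FROM pg_index AS i
WHERE i.indrelid = 'public.test'::regclass
  AND i.indisclustered;

-- Record test_id_idx as the clustering index without rewriting the table now;
-- a later bare "CLUSTER public.test;" will then use it.
ALTER TABLE public.test CLUSTER ON test_id_idx;
```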

Official manual: http://postgres.cn/docs/9.6/sql-cluster.html

Use case

Consider the following example of an SQL execution plan.

 -- The SQL statement is roughly: select xxx,count(*) from test where id < 317 and id > 0 group by xxx
 -- Part of the execution plan:
 ->  Bitmap Heap Scan on test  (cost=26785.44..49868.47 rows=7141 width=222) (actual time=1113.964..3513.711 rows=50849 loops=1)
      Recheck Cond: ((id < 317) AND (id > 0))
      Heap Blocks: exact=50616
      Buffers: shared hit=33 read=62433 written=23

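As shown above, the bitmap heap scan visited exact=50616 blocks: the rows matching id < 317 and id > 0 are spread across 50616 heap blocks. The scan returned 50849 rows, so almost every matching row sits in a block of its own. Reading that many randomly placed blocks (read=62433 buffers came from outside shared buffers) makes the query very slow.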
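
One way to confirm that the rows are scattered relative to id is to look at the planner statistics. This is a sketch, assuming the table has been analyzed recently; `pg_stats.correlation` and `pg_class.relpages` are standard catalog columns, and the interpretation in the comments is a rule of thumb rather than a hard threshold:

```sql
-- correlation near 1 or -1: the heap order follows id closely.
-- correlation near 0: rows with neighbouring id values are scattered across blocks.
SELECT attname, correlation
FROM pg_stats
WHERE schemaname = 'public'
  AND tablename = 'test'
  AND attname = 'id';

-- Total number of heap pages in the table, for comparison with "Heap Blocks" in the plan.
SELECT relpages, reltuples
FROM pg_class
WHERE oid = 'public.test'::regclass;
```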

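For this scenario we can have the rows of the test table stored on disk in id order, which greatly reduces the number of blocks the query has to scan.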

CLUSTER public.test USING test_id_idx;

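Running the statement above rewrites the table in test_id_idx order, so the rows with id < 317 end up stored contiguously on disk. The query then scans far fewer blocks, which improves its performance.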
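
A sketch of how the effect could be verified afterwards, assuming the same table and index. The query is a simplified stand-in for the original statement (the grouping column, shown only as xxx above, is left out), and no particular output is implied:

```sql
-- Rewrite the table in index order and print progress details.
CLUSTER VERBOSE public.test USING test_id_idx;

-- Refresh planner statistics so the new physical order is taken into account.
ANALYZE public.test;

-- Re-check the plan; compare "Heap Blocks" and "Buffers" with the earlier output.
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*)
FROM public.test
WHERE id < 317 AND id > 0;
```

Two points are worth keeping in mind. CLUSTER takes an ACCESS EXCLUSIVE lock, so the table cannot be read or written while it runs. It is also a one-time operation: rows inserted or updated later are not kept in index order, so the command has to be repeated periodically if the ordering should be maintained.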
