PG_插件-pg_hint_plan

目录

概念

编译安装

验证

支持的hint参数

实战

 

概念

  pg_hint_plan是一款插件,类似于oracle的hint;用于选择特定的执行计划,进行SQL调优。

编译安装

本文以pg_hint_plan-REL10_1_3_3.tar.gz+pg10.6为例;

#解压

tar xf  pg_hint_plan-REL10_1_3_3.tar.gz

chown -R postgres:postgres pg_hint_plan-REL10_1_3_3

#编译安装

su – postgres

export PATH=/opt/pgsql/bin:$PATH

cd pg_hint_plan-REL10_1_3_3

gmake clean

gmake

gmake install

#编译后日志

/usr/bin/mkdir -p '/opt/pgsql/share/extension'

/usr/bin/mkdir -p '/opt/pgsql/share/extension'

/usr/bin/mkdir -p '/opt/pgsql/lib'

/usr/bin/install -c -m 644 .//pg_hint_plan.control '/opt/pgsql/share/extension/'

/usr/bin/install -c -m 644 .//pg_hint_plan--*.sql  '/opt/pgsql/share/extension/'

/usr/bin/install -c -m 755  pg_hint_plan.so '/opt/pgsql/lib/'

#修改配置,创建插件

在postgresql.conf的shared_preload_libraries配置项加入pg_hint_plan

shared_preload_libraries = 'pg_stat_statements,pg_jieba,rum,pg_pathman,pg_bigm,pg_hint_plan'

#重启PG

/opt/pgsql/bin/pg_ctl -D /pgsql/data_10 restart

#登陆PG,创建插件

postgres=# create extension pg_hint_plan ;

CREATE EXTENSION

 

验证

#建表造数据

create table test11(id int);

insert into test11 select * from generate_series(1,10);

create index idx_test11 on test11 (id);

#验证

postgres=# explain select * from test11 where id=1;

                      QUERY PLAN                      

------------------------------------------------------

 Seq Scan on test11  (cost=0.00..1.12 rows=1 width=4)

   Filter: (id = 1)

使用hint:

postgres=# /*+ IndexScan(test11) */ explain select * from test11 where id=1;

                               QUERY PLAN                                

-------------------------------------------------------------------------

 Index Scan using idx_test11 on test11  (cost=0.14..8.15 rows=1 width=4)

   Index Cond: (id = 1)

 

支持的hint参数

Format:(/*+ SeqScan(table) */)

Description

SeqScan(table)

Forces sequential scan on the table

TidScan(table)

Forces TID scan on the table.

IndexScan(table[ index...])

Forces index scan on the table. Restricts to specified indexes if any.

IndexOnlyScan(table[ index...])

Forces index only scan on the table. Rstricts to specfied indexes if any. Index scan may be used if index only scan is not available. Available for PostgreSQL 9.2 and later.

BitmapScan(table[ index...])

Forces bitmap scan on the table. Restoricts to specfied indexes if any.

NoSeqScan(table)

Forces not to do sequential scan on the table.

NoTidScan(table)

Forces not to do TID scan on the table.

NoIndexScan(table)

Forces not to do index scan and index only scan (For PostgreSQL 9.2 and later) on the table.

NoIndexOnlyScan(table)

Forces not to do index only scan on the table. Available for PostgreSQL 9.2 and later.

NoBitmapScan(table)

Forces not to do bitmap scan on the table.

NestLoop(table table[ table...])

Forces nested loop for the joins consist of the specifiled tables.

HashJoin(table table[ table...])

Forces hash join for the joins consist of the specifiled tables.

MergeJoin(table table[ table...])

Forces merge join for the joins consist of the specifiled tables.

NoNestLoop(table table[ table...])

Forces not to do nested loop for the joins consist of the specifiled tables.

NoHashJoin(table table[ table...])

Forces not to do hash join for the joins consist of the specifiled tables.

NoMergeJoin(table table[ table...])

Forces not to do merge join for the joins consist of the specifiled tables.

Leading(table table[ table...])

Forces join order as specified.

Leading()

Forces join order and directions as specified. A join pair is a pair of tables and/or other join pairs enclosed by parentheses, which can make a nested structure.

Rows(table table[ table...] correction)

Corrects row number of a result of the joins consist of the specfied tables. The available correction methods are absolute (#), addition (+), subtract (-) and multiplication (*). should be a string that strtod() can read.

Set(GUC-param value)

Set the GUC parameter to the value while planner is running.

 

实战效果

原始SQL:

explain (ANALYZE true,buffers true)

WITH A AS (

SELECT

  RESULT.x1

FROM

  T1 TASK

  LEFT JOIN

T2 RELATE

  ON RELATE.x2 = TASK.

  ID

LEFT JOIN t3 RESULT ON RELATE.x3 = RESULT.x4

WHERE

1 = 1

AND TASK.ID = 'xxxxxxxxxx'

AND RESULT.DATA_SOURCE IN ( 'P12', 'P14', 'P02', 'P17', 'P26' )

AND RESULT.CROWD_NAME IS NOT NULL

)

SELECT

COUNT( A.entity_id ) AS cnt

  FROM

  A;

对应执行计划:

Aggregate  (cost=113379.07..113379.08 rows=1 width=8) (actual time=2772.901..2772.901 rows=1 loops=1)

  Buffers: shared hit=719635, temp written=2607

  CTE a

    ->  Nested Loop  (cost=0.57..113105.31 rows=12167 width=4) (actual time=0.050..2077.828 rows=1525996 loops=1)

          Buffers: shared hit=719635

          ->  Index Only Scan using idx_zdr_arrive_local_task_relate_code_task_id on zdr_arrive_local_task_relate relate  (cost=0.14..3.33 rows=1 width=18) (actual time=0.010..0.012 rows=1 loops=1)

                Index Cond: (task_id = 'xxxxxx'::text)

                Heap Fetches: 1

                Buffers: shared hit=3

          ->  Index Scan using zdr_arrive_local_cache_datasource_crowd_id on zdr_arrive_local_cache result  (cost=0.43..112493.64 rows=60834 width=12) (actual time=0.037..1876.879 rows=1525996 loops=1)

                Index Cond: (((crowd_id)::text = (relate.relate_code)::text) AND ((data_source)::text = ANY ('{P12,P14,P02,P17,P26}'::text[])))

                Filter: (crowd_name IS NOT NULL)

                Buffers: shared hit=719632

  ->  CTE Scan on a  (cost=0.00..243.34 rows=12167 width=4) (actual time=0.051..2566.068 rows=1525996 loops=1)

        Buffers: shared hit=719635, temp written=2607

Planning time: 0.423 ms

Execution time: 2778.586 ms

 

问题:

标红处可以看到调用索引的情况下,返回了 1525996  条数据(该表总数据量两千万),明显返回这么大数据集的情况下,走索引造成的随机IO相当惊人,优化可以从耗时长的那一步入手,可以尝试调用顺序扫描。此索引有用于其他SQL不能被删除。

 

优化后SQL:去掉不必要的表连接,改用顺序扫描

/*+ noindexscan(result) */

explain (ANALYZE true,buffers true)

WITH A AS (

Xxxxxx

 

对应执行计划:

 

Aggregate  (cost=489540.59..489540.60 rows=1 width=8) (actual time=1134.751..1134.751 rows=1 loops=1)

  Buffers: shared hit=472204, temp written=2667

  CTE a

    ->  Gather  (cost=3.35..489266.83 rows=12167 width=4) (actual time=0.458..512.064 rows=1560819 loops=1)

          Workers Planned: 6

          Workers Launched: 6

          Buffers: shared hit=472204

          ->  Hash Join  (cost=3.35..489266.83 rows=2028 width=4) (actual time=0.340..667.536 rows=222974 loops=7)

                Hash Cond: ((result.crowd_id)::text = (relate.relate_code)::text)

                Buffers: shared hit=472204

                ->  Parallel Seq Scan on zdr_arrive_local_cache result  (cost=0.00..488458.18 rows=209339 width=12) (actual time=0.153..599.215 rows=227961 loops=7)

                      Filter: ((crowd_name IS NOT NULL) AND ((data_source)::text = ANY ('{P12,P14,P02,P17,P26}'::text[])))

                      Rows Removed by Filter: 734658

                      Buffers: shared hit=471811

                ->  Hash  (cost=3.33..3.33 rows=1 width=18) (actual time=0.069..0.069 rows=1 loops=7)

                      Buckets: 1024  Batches: 1  Memory Usage: 9kB

                      Buffers: shared hit=111

                      ->  Index Only Scan using idx_zdr_arrive_local_task_relate_code_task_id on zdr_arrive_local_task_relate relate  (cost=0.14..3.33 rows=1 width=18) (actual time=0.066..0.067 rows=1 loops=7)

                            Index Cond: (task_id = '76e7b381-1f86-47d7-a2f3-deac63c2e75f'::text)

                            Heap Fetches: 1

                            Buffers: shared hit=111

  ->  CTE Scan on a  (cost=0.00..243.34 rows=12167 width=4) (actual time=0.459..951.253 rows=1560819 loops=1)

        Buffers: shared hit=472204, temp written=2667

Planning time: 0.490 ms

Execution time: 1179.083 ms

性能提升2.35

 

 

底线返回TOP

上一篇:通过备份 Etcd 来完美恢复 Kubernetes 中的误删数据


下一篇:牛客网The Flee Plan of Groundhog