目录
概念
pg_hint_plan是一款插件,类似于oracle的hint;用于选择特定的执行计划,进行SQL调优。
编译安装
本文以pg_hint_plan-REL10_1_3_3.tar.gz+pg10.6为例;
#解压 tar xf pg_hint_plan-REL10_1_3_3.tar.gz chown -R postgres:postgres pg_hint_plan-REL10_1_3_3 #编译安装 su – postgres export PATH=/opt/pgsql/bin:$PATH cd pg_hint_plan-REL10_1_3_3 gmake clean gmake gmake install #编译后日志 /usr/bin/mkdir -p '/opt/pgsql/share/extension' /usr/bin/mkdir -p '/opt/pgsql/share/extension' /usr/bin/mkdir -p '/opt/pgsql/lib' /usr/bin/install -c -m 644 .//pg_hint_plan.control '/opt/pgsql/share/extension/' /usr/bin/install -c -m 644 .//pg_hint_plan--*.sql '/opt/pgsql/share/extension/' /usr/bin/install -c -m 755 pg_hint_plan.so '/opt/pgsql/lib/' #修改配置,创建插件 在postgresql.conf的shared_preload_libraries配置项加入pg_hint_plan shared_preload_libraries = 'pg_stat_statements,pg_jieba,rum,pg_pathman,pg_bigm,pg_hint_plan' #重启PG /opt/pgsql/bin/pg_ctl -D /pgsql/data_10 restart #登陆PG,创建插件 postgres=# create extension pg_hint_plan ; CREATE EXTENSION |
验证
#建表造数据 create table test11(id int); insert into test11 select * from generate_series(1,10); create index idx_test11 on test11 (id); #验证 postgres=# explain select * from test11 where id=1; QUERY PLAN ------------------------------------------------------ Seq Scan on test11 (cost=0.00..1.12 rows=1 width=4) Filter: (id = 1) 使用hint: postgres=# /*+ IndexScan(test11) */ explain select * from test11 where id=1; QUERY PLAN ------------------------------------------------------------------------- Index Scan using idx_test11 on test11 (cost=0.14..8.15 rows=1 width=4) Index Cond: (id = 1) |
支持的hint参数
Format:(/*+ SeqScan(table) */) |
Description |
SeqScan(table) |
Forces sequential scan on the table |
TidScan(table) |
Forces TID scan on the table. |
IndexScan(table[ index...]) |
Forces index scan on the table. Restricts to specified indexes if any. |
IndexOnlyScan(table[ index...]) |
Forces index only scan on the table. Rstricts to specfied indexes if any. Index scan may be used if index only scan is not available. Available for PostgreSQL 9.2 and later. |
BitmapScan(table[ index...]) |
Forces bitmap scan on the table. Restoricts to specfied indexes if any. |
NoSeqScan(table) |
Forces not to do sequential scan on the table. |
NoTidScan(table) |
Forces not to do TID scan on the table. |
NoIndexScan(table) |
Forces not to do index scan and index only scan (For PostgreSQL 9.2 and later) on the table. |
NoIndexOnlyScan(table) |
Forces not to do index only scan on the table. Available for PostgreSQL 9.2 and later. |
NoBitmapScan(table) |
Forces not to do bitmap scan on the table. |
NestLoop(table table[ table...]) |
Forces nested loop for the joins consist of the specifiled tables. |
HashJoin(table table[ table...]) |
Forces hash join for the joins consist of the specifiled tables. |
MergeJoin(table table[ table...]) |
Forces merge join for the joins consist of the specifiled tables. |
NoNestLoop(table table[ table...]) |
Forces not to do nested loop for the joins consist of the specifiled tables. |
NoHashJoin(table table[ table...]) |
Forces not to do hash join for the joins consist of the specifiled tables. |
NoMergeJoin(table table[ table...]) |
Forces not to do merge join for the joins consist of the specifiled tables. |
Leading(table table[ table...]) |
Forces join order as specified. |
Leading() |
Forces join order and directions as specified. A join pair is a pair of tables and/or other join pairs enclosed by parentheses, which can make a nested structure. |
Rows(table table[ table...] correction) |
Corrects row number of a result of the joins consist of the specfied tables. The available correction methods are absolute (#), addition (+), subtract (-) and multiplication (*). should be a string that strtod() can read. |
Set(GUC-param value) |
Set the GUC parameter to the value while planner is running. |
实战效果
原始SQL: explain (ANALYZE true,buffers true) WITH A AS ( SELECT RESULT.x1 FROM T1 TASK LEFT JOIN T2 RELATE ON RELATE.x2 = TASK. ID LEFT JOIN t3 RESULT ON RELATE.x3 = RESULT.x4 WHERE 1 = 1 AND TASK.ID = 'xxxxxxxxxx' AND RESULT.DATA_SOURCE IN ( 'P12', 'P14', 'P02', 'P17', 'P26' ) AND RESULT.CROWD_NAME IS NOT NULL ) SELECT COUNT( A.entity_id ) AS cnt FROM A; 对应执行计划: Aggregate (cost=113379.07..113379.08 rows=1 width=8) (actual time=2772.901..2772.901 rows=1 loops=1) Buffers: shared hit=719635, temp written=2607 CTE a -> Nested Loop (cost=0.57..113105.31 rows=12167 width=4) (actual time=0.050..2077.828 rows=1525996 loops=1) Buffers: shared hit=719635 -> Index Only Scan using idx_zdr_arrive_local_task_relate_code_task_id on zdr_arrive_local_task_relate relate (cost=0.14..3.33 rows=1 width=18) (actual time=0.010..0.012 rows=1 loops=1) Index Cond: (task_id = 'xxxxxx'::text) Heap Fetches: 1 Buffers: shared hit=3 -> Index Scan using zdr_arrive_local_cache_datasource_crowd_id on zdr_arrive_local_cache result (cost=0.43..112493.64 rows=60834 width=12) (actual time=0.037..1876.879 rows=1525996 loops=1) Index Cond: (((crowd_id)::text = (relate.relate_code)::text) AND ((data_source)::text = ANY ('{P12,P14,P02,P17,P26}'::text[]))) Filter: (crowd_name IS NOT NULL) Buffers: shared hit=719632 -> CTE Scan on a (cost=0.00..243.34 rows=12167 width=4) (actual time=0.051..2566.068 rows=1525996 loops=1) Buffers: shared hit=719635, temp written=2607 Planning time: 0.423 ms Execution time: 2778.586 ms
|
问题:
标红处可以看到调用索引的情况下,返回了 1525996 条数据(该表总数据量两千万),明显返回这么大数据集的情况下,走索引造成的随机IO相当惊人,优化可以从耗时长的那一步入手,可以尝试调用顺序扫描。此索引有用于其他SQL不能被删除。
优化后SQL:去掉不必要的表连接,改用顺序扫描 /*+ noindexscan(result) */ explain (ANALYZE true,buffers true) WITH A AS ( Xxxxxx
对应执行计划:
Aggregate (cost=489540.59..489540.60 rows=1 width=8) (actual time=1134.751..1134.751 rows=1 loops=1) Buffers: shared hit=472204, temp written=2667 CTE a -> Gather (cost=3.35..489266.83 rows=12167 width=4) (actual time=0.458..512.064 rows=1560819 loops=1) Workers Planned: 6 Workers Launched: 6 Buffers: shared hit=472204 -> Hash Join (cost=3.35..489266.83 rows=2028 width=4) (actual time=0.340..667.536 rows=222974 loops=7) Hash Cond: ((result.crowd_id)::text = (relate.relate_code)::text) Buffers: shared hit=472204 -> Parallel Seq Scan on zdr_arrive_local_cache result (cost=0.00..488458.18 rows=209339 width=12) (actual time=0.153..599.215 rows=227961 loops=7) Filter: ((crowd_name IS NOT NULL) AND ((data_source)::text = ANY ('{P12,P14,P02,P17,P26}'::text[]))) Rows Removed by Filter: 734658 Buffers: shared hit=471811 -> Hash (cost=3.33..3.33 rows=1 width=18) (actual time=0.069..0.069 rows=1 loops=7) Buckets: 1024 Batches: 1 Memory Usage: 9kB Buffers: shared hit=111 -> Index Only Scan using idx_zdr_arrive_local_task_relate_code_task_id on zdr_arrive_local_task_relate relate (cost=0.14..3.33 rows=1 width=18) (actual time=0.066..0.067 rows=1 loops=7) Index Cond: (task_id = '76e7b381-1f86-47d7-a2f3-deac63c2e75f'::text) Heap Fetches: 1 Buffers: shared hit=111 -> CTE Scan on a (cost=0.00..243.34 rows=12167 width=4) (actual time=0.459..951.253 rows=1560819 loops=1) Buffers: shared hit=472204, temp written=2667 Planning time: 0.490 ms Execution time: 1179.083 ms |
性能提升2.35倍