我们知道PostgreSQL数据库中有两大块buffer:wal buffer和shared buffer。两者作用分别是:
- wal buffer:预写日志缓冲区,wal writer进程负责将wal buffer缓冲区的数据写入WAL FILE中。
- shared buffer:数据页缓冲区,background writer进程负责将shared buffer缓冲区的数据写入DATA
FILE中。
当写入量非常大时,wal writer和background writer进程不足以满足系统负载时,用户进程(backend process)也会参与将BUFFER写到FILE的工作。
如何监控数据库的磁盘使用情况呢?我们可以通过操作系统的监控程序,如iotop、iostat等。除此之外,pg中提供了系统视图统计bgwriter,walwriter,backend process 写磁盘的实时指标。
pg_stat_bgwriter 视图的介绍:
bill=# \d pg_stat_bgwriter
View "pg_catalog.pg_stat_bgwriter"
Column | Type | Collation | Nullable | Default
-----------------------+--------------------------+-----------+----------+---------
checkpoints_timed | bigint | | |
checkpoints_req | bigint | | |
checkpoint_write_time | double precision | | |
checkpoint_sync_time | double precision | | |
buffers_checkpoint | bigint | | |
buffers_clean | bigint | | |
maxwritten_clean | bigint | | |
buffers_backend | bigint | | |
buffers_backend_fsync | bigint | | |
buffers_alloc | bigint | | |
stats_reset | timestamp with time zone | | |
1、pg_stat_bgwriter.buffers_clean bgwriter 每秒write多少shared buffer到disk(write-异步系统调用) , 说明了脏页产生较快,但是不影响业务。
2、pg_stat_bgwriter.buffers_backend backend process 每秒wirte多少shared buffer到disk(write-异步系统调用) , 说明产生脏页较快,并且bgwriter或checkpointer写脏页已经赶不上产生脏页的速度了,对业务开始有影响。
3、walwriter 每秒write多少wal buffer到disk(write-异步系统调用)
4、pg_stat_bgwriter.buffers_alloc 每秒分配多少新的shared buffer,说明了从磁盘读的频繁程度。
例子:
构建压测环境:
pgbench -i -s 100
pgbench -M prepared -n -r -P 1 -c 16 -j 16 -T 10000
监测:
1、bgwriter 每秒write多少shared buffer到disk(write-异步系统调用)
postgres@postgres=>select buffers_clean*8/1024||' MB' bg from pg_stat_bgwriter;
bg
----------
25911 MB
(1 row)
postgres@postgres=>\watch 1
Sun 19 Apr 2020 05:16:03 PM CST (every 1s)
bg
----------
26120 MB
(1 row)
Sun 19 Apr 2020 05:16:04 PM CST (every 1s)
bg
----------
26184 MB
(1 row)
Sun 19 Apr 2020 05:16:05 PM CST (every 1s)
bg
----------
26234 MB
(1 row)
Sun 19 Apr 2020 05:16:06 PM CST (every 1s)
bg
----------
26289 MB
(1 row)
2、backend process 每秒wirte多少shared buffer到disk(write-异步系统调用)
postgres@postgres=>select buffers_clean*8/1024||' MB' bg from pg_stat_bgwriter;
bg
----------
26533 MB
(1 row)
postgres@postgres=>\watch 1
Sun 19 Apr 2020 05:16:13 PM CST (every 1s)
bg
----------
26756 MB
(1 row)
Sun 19 Apr 2020 05:16:14 PM CST (every 1s)
bg
----------
26825 MB
(1 row)
Sun 19 Apr 2020 05:16:15 PM CST (every 1s)
bg
----------
26883 MB
(1 row)
Sun 19 Apr 2020 05:16:16 PM CST (every 1s)
bg
----------
26946 MB
(1 row)
3、walwriter 每秒write多少wal buffer到disk(write-异步系统调用)
postgres@postgres=>with a as (select pg_current_wal_lsn() lsn) select pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), lsn)) from a, pg_sleep(1);
pg_size_pretty
----------------
12 MB
(1 row)
postgres@postgres=>\watch 0.001
Sun 19 Apr 2020 05:16:40 PM CST (every 0.001s)
pg_size_pretty
----------------
112 MB
(1 row)
Sun 19 Apr 2020 05:16:41 PM CST (every 0.001s)
pg_size_pretty
----------------
96 MB
(1 row)
Sun 19 Apr 2020 05:16:42 PM CST (every 0.001s)
pg_size_pretty
----------------
104 MB
(1 row)
4、每秒分配多少新的shared buffer
postgres@postgres=>select buffers_alloc*8/1024||' MB' bg from pg_stat_bgwriter;
bg
----------
43668 MB
(1 row)
postgres@postgres=>\watch 1
Sun 19 Apr 2020 05:16:55 PM CST (every 1s)
bg
----------
43676 MB
(1 row)
Sun 19 Apr 2020 05:16:56 PM CST (every 1s)
bg
----------
43676 MB
(1 row)
Sun 19 Apr 2020 05:16:57 PM CST (every 1s)
bg
----------
43676 MB
(1 row)
iotop监控情况:
经验证和前面的方法基本差不多。
[root@oracle ~]# iotop
Total DISK READ : 0.00 B/s | Total DISK WRITE : 60.69 M/s
Actual DISK READ: 0.00 B/s | Actual DISK WRITE: 21.32 M/s
TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND
27511 be/4 pg12 0.00 B/s 21.46 M/s 0.00 % 26.74 % postgres: walwriter
9773 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.09 % [kworker/0:0]
28619 be/4 pg12 0.00 B/s 7.51 K/s 0.00 % 0.01 % postgres: postgres postgres [local] UPDATE
28622 be/4 pg12 0.00 B/s 165.11 K/s 0.00 % 0.00 % postgres: postgres postgres [local] idle in transaction
28628 be/4 pg12 0.00 B/s 22.52 K/s 0.00 % 0.00 % postgres: postgres postgres [local] idle in transaction
28620 be/4 pg12 0.00 B/s 15.01 K/s 0.00 % 0.00 % postgres: postgres postgres [local] BIND
27510 be/4 pg12 0.00 B/s 37.39 M/s 0.00 % 0.00 % postgres: background writer
28607 be/4 pg12 0.00 B/s 15.01 K/s 0.00 % 0.00 % postgres: postgres postgres [local] UPDATE
28611 be/4 pg12 0.00 B/s 315.21 K/s 0.00 % 0.00 % postgres: postgres postgres [local] idle in transaction
28612 be/4 pg12 0.00 B/s 172.62 K/s 0.00 % 0.00 % postgres: postgres postgres [local] idle in transaction
28615 be/4 pg12 0.00 B/s 7.51 K/s 0.00 % 0.00 % postgres: postgres postgres [local] idle in transaction
28624 be/4 pg12 0.00 B/s 15.01 K/s 0.00 % 0.00 % postgres: postgres postgres [local] idle in transaction
28625 be/4 pg12 0.00 B/s 315.21 K/s 0.00 % 0.00 % postgres: postgres postgres [local] UPDATE
28626 be/4 pg12 0.00 B/s 30.02 K/s 0.00 % 0.00 % postgres: postgres postgres [local] INSERT
28627 be/4 pg12 0.00 B/s 22.52 K/s 0.00 % 0.00 % postgres: postgres postgres [local] idle in transaction
28629 be/4 pg12 0.00 B/s 780.53 K/s 0.00 % 0.00 % postgres: postgres postgres [local] UPDATE
除此之外,还可以自定义监测某个时间段内的wal写入量。
- pg_current_wal_lsn(),查看当前的WAL LSN位点。
- pg_wal_lsn_diff(lsn,lsn),计算两个LSN位点之间有多少字节。
postgres@postgres=>select pg_current_wal_lsn();
pg_current_wal_lsn
--------------------
18/D6430000
(1 row)
postgres@postgres=>select pg_current_wal_lsn();
pg_current_wal_lsn
--------------------
18/EA57A6D0
(1 row)
postgres@postgres=>select pg_size_pretty(pg_wal_lsn_diff('18/D6430000','18/EA57A6D0'));
pg_size_pretty
----------------
-321 MB
(1 row)
参考链接:
https://www.postgresql.org/docs/12/monitoring-stats.html#MONITORING-STATS-VIEWS