Test environment:
OS version:
# more /etc/redhat-release
CentOS Linux release 7.8.2003 (Core)
# su - postgres
Last login: Sun Jun 6 04:02:39 CST 2021 on pts/0
postgresql version:
$ psql --version
psql (PostgreSQL) 13.3
$ exit
logout
pacemaker version:
# rpm -qa|grep pacemaker
pacemaker-libs-1.1.23-1.el7_9.1.x86_64
pacemaker-1.1.23-1.el7_9.1.x86_64
pacemaker-cli-1.1.23-1.el7_9.1.x86_64
pacemaker-cluster-libs-1.1.23-1.el7_9.1.x86_64
pcs version:
# rpm -qa|grep pcs
pcs-0.9.169-3.el7.centos.1.x86_64
#
I. Install the base software
1.hostname[all servers]
# hostnamectl set-hostname pg13vm01   (use pg13vm02 and pg13vm03 on the other two nodes)
cat << _EOF_ > /etc/hosts
127.0.0.1 localhost
192.168.43.96 pg13vm01
192.168.43.217 pg13vm02
192.168.43.185 pg13vm03
_EOF_
# more /etc/hosts
127.0.0.1 localhost
192.168.43.96 pg13vm01
192.168.43.217 pg13vm02
192.168.43.185 pg13vm03
#
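The hosts entries above can be sanity-checked with a small loop. This is a sketch: /tmp/hosts.sample is a stand-in copy so the snippet is self-contained; on the servers you would point it at /etc/hosts instead.

```shell
# Each cluster node name should appear exactly once in the hosts file.
# /tmp/hosts.sample is a local stand-in for /etc/hosts.
cat > /tmp/hosts.sample <<'EOF'
127.0.0.1 localhost
192.168.43.96 pg13vm01
192.168.43.217 pg13vm02
192.168.43.185 pg13vm03
EOF
for n in pg13vm01 pg13vm02 pg13vm03; do
    if [ "$(grep -cw "$n" /tmp/hosts.sample)" -eq 1 ]; then
        echo "$n ok"
    else
        echo "$n missing or duplicated"
    fi
done
```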
2.firewalld and selinux[all servers]
# systemctl disable firewalld
# systemctl stop firewalld
# sestatus
SELinux status: disabled
#### or, to keep firewalld enabled instead, open the required services ####
systemctl --quiet --now enable firewalld
firewall-cmd --quiet --permanent --add-service=high-availability
firewall-cmd --quiet --permanent --add-service=postgresql
firewall-cmd --quiet --reload
#
3.pcs[all servers]
# yum install -y pacemaker pcs
# yum install autoconf automake libtool
# yum install docbook-style-xsl
# yum install gcc-c++ glib2-devel
4. Update pacemaker resource-agents (the autoconf/libtool/docbook/gcc packages in step 3 are its build dependencies)
# unzip resource-agents-4.8.0.zip
# cd resource-agents-4.8.0
# ./autogen.sh
# ./configure
# make
# make install
Confirm that PostgreSQL 12 and later are supported:
line 1918 of /usr/lib/ocf/resource.d/heartbeat/pgsql contains ocf_version_cmp "$version" "12"
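The version check above can be scripted. A sketch, with assumptions: RA_FILE would normally be /usr/lib/ocf/resource.d/heartbeat/pgsql, but a one-line stand-in is created here so the check is demonstrable anywhere.

```shell
# Look for the PostgreSQL 12 version comparison in the pgsql resource agent.
RA_FILE=/tmp/pgsql.ra.sample   # stand-in for /usr/lib/ocf/resource.d/heartbeat/pgsql
cat > "$RA_FILE" <<'EOF'
    ocf_version_cmp "$version" "12"
EOF
if grep -q 'ocf_version_cmp "\$version" "12"' "$RA_FILE"; then
    echo "pgsql RA supports PostgreSQL 12+"
else
    echo "pgsql RA predates PostgreSQL 12 support; rebuild resource-agents"
fi
```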
5. Set the hacluster user password [all servers]
# echo "hacluster" | passwd --stdin hacluster
6.postgresql[all servers]
#### yum.repo.list ####
# yum install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm
#### install postgresql ####
# yum install -y postgresql13-server postgresql13-contrib
# rpm -qa|grep postgresql
postgresql13-server-13.3-1PGDG.rhel7.x86_64
postgresql13-13.3-1PGDG.rhel7.x86_64
postgresql13-contrib-13.3-1PGDG.rhel7.x86_64
postgresql13-libs-13.3-1PGDG.rhel7.x86_64
#
7.service setup
# systemctl disable postgresql-13.service
# systemctl disable corosync
# systemctl disable pacemaker
# systemctl enable pcsd.service
# systemctl start pcsd.service
8.cluster auth[any one host]
# pcs cluster auth pg13vm01 pg13vm02 pg13vm03 -u hacluster -p "hacluster"
pg13vm01: Authorized
pg13vm02: Authorized
pg13vm03: Authorized
#
9. Set up the cluster [any one host]
#### configure the cluster nodes ####
# pcs cluster setup --name cluster_pg01 pg13vm01 pg13vm02 pg13vm03
Destroying cluster on nodes: pg13vm01, pg13vm02, pg13vm03...
pg13vm01: Stopping Cluster (pacemaker)...
pg13vm02: Stopping Cluster (pacemaker)...
pg13vm03: Stopping Cluster (pacemaker)...
pg13vm01: Successfully destroyed cluster
pg13vm03: Successfully destroyed cluster
pg13vm02: Successfully destroyed cluster
Sending 'pacemaker_remote authkey' to 'pg13vm01', 'pg13vm02', 'pg13vm03'
pg13vm01: successful distribution of the file 'pacemaker_remote authkey'
pg13vm02: successful distribution of the file 'pacemaker_remote authkey'
pg13vm03: successful distribution of the file 'pacemaker_remote authkey'
Sending cluster config files to the nodes...
pg13vm01: Succeeded
pg13vm02: Succeeded
pg13vm03: Succeeded
Synchronizing pcsd certificates on nodes pg13vm01, pg13vm02, pg13vm03...
pg13vm01: Success
pg13vm02: Success
pg13vm03: Success
Restarting pcsd on the nodes in order to reload the certificates...
pg13vm01: Success
pg13vm02: Success
pg13vm03: Success
#
#### start all cluster nodes ####
# pcs cluster start --all
pg13vm01: Starting Cluster (corosync)...
pg13vm02: Starting Cluster (corosync)...
pg13vm03: Starting Cluster (corosync)...
pg13vm03: Starting Cluster (pacemaker)...
pg13vm01: Starting Cluster (pacemaker)...
pg13vm02: Starting Cluster (pacemaker)...
#
#### check the cluster status ####
# pcs status --full
Cluster name: cluster_pg01
WARNINGS:
No stonith devices and stonith-enabled is not false
Stack: corosync
Current DC: pg13vm01 (1) (version 1.1.23-1.el7_9.1-9acf116022) - partition with quorum
Last updated: Sun Jun 6 00:11:57 2021
Last change: Sun Jun 6 00:11:48 2021 by hacluster via crmd on pg13vm01
3 nodes configured
0 resource instances configured
Online: [ pg13vm01 (1) pg13vm02 (2) pg13vm03 (3) ]
No resources
Node Attributes:
* Node pg13vm01 (1):
* Node pg13vm02 (2):
* Node pg13vm03 (3):
Migration Summary:
* Node pg13vm02 (2):
* Node pg13vm01 (1):
* Node pg13vm03 (3):
Fencing History:
PCSD Status:
pg13vm03: Online
pg13vm01: Online
pg13vm02: Online
Daemon Status:
corosync: active/disabled
pacemaker: active/disabled
pcsd: active/enabled
#
10. Database directory setup [all servers]
# mkdir /pgdata
# chown postgres:postgres /pgdata/
# chmod 700 /pgdata/
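PostgreSQL refuses to start if the data directory is not mode 700 (or 750) and owned by postgres, so it is worth checking after the mkdir/chown/chmod above. A minimal sketch, using a scratch directory in place of /pgdata so it can run anywhere:

```shell
# Verify the data directory permissions; DATADIR stands in for /pgdata.
DATADIR=/tmp/pgdata.demo
mkdir -p "$DATADIR"
chmod 700 "$DATADIR"
perms=$(stat -c '%a' "$DATADIR")
if [ "$perms" = "700" ]; then
    echo "permissions ok ($perms)"
else
    echo "wrong mode $perms; fix with: chmod 700 $DATADIR"
fi
```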
11. postgres user environment variables [all servers]
# su - postgres
$ vi .bash_profile
[ -f /etc/profile ] && source /etc/profile
export PGDATA=/pgdata
export PATH=/usr/pgsql-13/bin:$PATH
# If you want to customize your settings,
# Use the file below. This is not overridden
# by the RPMS.
[ -f /var/lib/pgsql/.pgsql_profile ] && source /var/lib/pgsql/.pgsql_profile
12. Primary database setup [pg13vm01]
#### pg13vm01 ####
1). Initialize the database
# su - postgres
$ initdb -D /pgdata/
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.
The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".
Data page checksums are disabled.
fixing permissions on existing directory /pgdata ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... Asia/Shanghai
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok
initdb: warning: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.
Success. You can now start the database server using:
pg_ctl -D /pgdata/ -l logfile start
$
2). Configure host access (pg_hba.conf)
$ vi pg_hba.conf
# TYPE DATABASE USER ADDRESS METHOD
# "local" is for Unix domain socket connections only
local all all trust
# IPv4 local connections:
host all all 127.0.0.1/32 trust
host all all 0.0.0.0/0 md5
# IPv6 local connections:
#host all all ::1/128 trust
# Allow replication connections from localhost, by a user with the
# replication privilege.
local replication all trust
host replication all 127.0.0.1/32 trust
host replication replic_user 192.168.43.0/24 md5
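The key line above is the replication entry for replic_user; if it is missing, pg_basebackup in step 13 fails with an authentication error. A quick check can be scripted (a sketch: a local sample file stands in for /pgdata/pg_hba.conf so it is self-contained):

```shell
# Verify pg_hba.conf has an md5 replication entry for replic_user.
HBA=/tmp/pg_hba.sample   # stand-in for /pgdata/pg_hba.conf
cat > "$HBA" <<'EOF'
local   all           all                          trust
host    all           all          127.0.0.1/32    trust
host    replication   replic_user  192.168.43.0/24 md5
EOF
if awk '$1=="host" && $2=="replication" && $3=="replic_user" && $5=="md5" {found=1}
        END {exit found ? 0 : 1}' "$HBA"; then
    echo "replication entry present"
else
    echo "replication entry missing"
fi
```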
3). Configure database parameters
$ vi postgresql.conf
listen_addresses = '*'
wal_keep_size = 10240 # WAL retained for standbys, in MB (replaces wal_keep_segments in PG 13)
4). Create the replication user
$ pg_ctl start
waiting for server to start....2021-06-06 00:28:55.991 CST [13488] LOG: redirecting log output to logging collector process
2021-06-06 00:28:55.991 CST [13488] HINT: Future log output will appear in directory "log".
done
server started
-bash-4.2$ psql
psql (13.3)
Type "help" for help.
postgres=# create user replic_user with replication password 'replic_user';
CREATE ROLE
postgres=# \du
List of roles
Role name | Attributes | Member of
-------------+------------------------------------------------------------+-----------
postgres | Superuser, Create role, Create DB, Replication, Bypass RLS | {}
replic_user | Replication | {}
postgres=#
13. Create the secondary databases [pg13vm02, pg13vm03]
#### create the secondary servers ####
$ cd /pgdata/
$ pg_basebackup -h pg13vm01 -U replic_user -D /pgdata/ -P -v
Password:
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/A000028 on timeline 1
pg_basebackup: starting background WAL receiver
pg_basebackup: created temporary replication slot "pg_basebackup_13632"
24987/24987 kB (100%), 1/1 tablespace
pg_basebackup: write-ahead log end point: 0/A000100
pg_basebackup: waiting for background process to finish streaming ...
pg_basebackup: syncing data to disk ...
pg_basebackup: renaming backup_manifest.tmp to backup_manifest
pg_basebackup: base backup completed
$
14. Stop the primary database [pg13vm01] (from here on, pacemaker starts and stops PostgreSQL)
$ pg_ctl stop
waiting for server to shut down.... done
server stopped
$
II. Configure the pacemaker database cluster [pg13vm01]
1. Check the cluster status
# pcs status
Cluster name: cluster_pg01
WARNINGS:
No stonith devices and stonith-enabled is not false
Stack: corosync
Current DC: pg13vm01 (version 1.1.23-1.el7_9.1-9acf116022) - partition with quorum
Last updated: Sun Jun 6 00:53:57 2021
Last change: Sun Jun 6 00:11:48 2021 by hacluster via crmd on pg13vm01
3 nodes configured
0 resource instances configured
Online: [ pg13vm01 pg13vm02 pg13vm03 ]
No resources
Daemon Status:
corosync: active/disabled
pacemaker: active/disabled
pcsd: active/enabled
#
2. Create the CIB file
# pcs cluster cib cib.xml
# ll
total 4
-rw-r--r-- 1 root root 1672 Jun 6 00:52 cib.xml
#
3. Configure the database resources
# property: cluster-name
pcs -f cib.xml property set cluster-name="pg_cluster_01"
# property: disable stonith, quorum
pcs -f cib.xml property set no-quorum-policy="ignore"
pcs -f cib.xml property set stonith-enabled="false"
# resource: master-vip
pcs -f cib.xml resource create master-vip ocf:heartbeat:IPaddr2 \
ip=192.168.43.201 cidr_netmask=24 nic=ens33 iflabel=master op monitor interval=5s
# resource: replica-vip
pcs -f cib.xml resource create replica-vip ocf:heartbeat:IPaddr2 \
ip=192.168.43.202 cidr_netmask=24 nic=ens33 iflabel=replica op monitor interval=5s
# resource: pgsql
pcs -f cib.xml resource create pgsql ocf:heartbeat:pgsql \
pgctl="/usr/pgsql-13/bin/pg_ctl" \
psql="/usr/pgsql-13/bin/psql" \
pgdata="/pgdata" \
node_list="pg13vm01 pg13vm02 pg13vm03" \
restore_command="" \
master_ip="192.168.43.201" \
repuser="replic_user" \
rep_mode="sync" \
primary_conninfo_opt="password=replic_user keepalives_idle=60 keepalives_interval=5 keepalives_count=5" \
op monitor interval="11s" \
op monitor interval="10s" role="Master" \
master master-max=1 master-node-max=1 clone-max=3 clone-node-max=1 notify=true target-role='Started'
# constraint: master-vip, pgsql on master node
pcs -f cib.xml constraint colocation add master-vip with master pgsql-master INFINITY
# constraint: pgsql promote node MasterGroup
pcs -f cib.xml constraint order promote pgsql-master then start master-vip symmetrical=false score=INFINITY
# constraint: pgsql demote node MasterGroup
pcs -f cib.xml constraint order demote pgsql-master then stop master-vip symmetrical=false score=0
# constraint: replica-vip prefers the sync standby (HS:sync), falling back to the primary (PRI)
pcs -f cib.xml constraint location replica-vip rule score=200 pgsql-status eq HS:sync
pcs -f cib.xml constraint location replica-vip rule score=100 pgsql-status eq PRI
pcs -f cib.xml constraint location replica-vip rule score=-INFINITY not_defined pgsql-status
pcs -f cib.xml constraint location replica-vip rule score=-INFINITY pgsql-status ne HS:sync and pgsql-status ne PRI
# cluster: push cib file into cib
pcs cluster cib-push cib.xml
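Once the CIB is pushed, the promoted node can be extracted from "pcs status" output for scripting (for example, to point monitoring at the current master). A sketch that parses an embedded sample of the output; live, you would pipe `pcs status` in instead:

```shell
# Pull the current Master node name out of "pcs status" style output.
cat > /tmp/pcs_status.sample <<'EOF'
Master/Slave Set: pgsql-master [pgsql]
Masters: [ pg13vm01 ]
Slaves: [ pg13vm02 pg13vm03 ]
EOF
master=$(awk -F'[][]' '/Masters:/ { gsub(/ /, "", $2); print $2 }' /tmp/pcs_status.sample)
echo "current master: $master"
```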
4. Refresh the resource state
# pcs resource refresh --full
Waiting for 1 reply from the CRMd. OK
#
5. Query the cluster status
# pcs status --full
Cluster name: cluster_pg01
Stack: corosync
Current DC: pg13vm01 (1) (version 1.1.23-1.el7_9.1-9acf116022) - partition with quorum
Last updated: Sun Jun 6 04:02:24 2021
Last change: Sun Jun 6 04:02:08 2021 by root via crm_attribute on pg13vm01
3 nodes configured
5 resource instances configured
Online: [ pg13vm01 (1) pg13vm02 (2) pg13vm03 (3) ]
Full list of resources:
master-vip (ocf::heartbeat:IPaddr2): Started pg13vm01
replica-vip (ocf::heartbeat:IPaddr2): Started pg13vm03
Master/Slave Set: pgsql-master [pgsql]
pgsql (ocf::heartbeat:pgsql): Master pg13vm01
pgsql (ocf::heartbeat:pgsql): Slave pg13vm02
pgsql (ocf::heartbeat:pgsql): Slave pg13vm03
Masters: [ pg13vm01 ]
Slaves: [ pg13vm02 pg13vm03 ]
Node Attributes:
* Node pg13vm01 (1):
+ master-pgsql : 1000
+ pgsql-data-status : LATEST
+ pgsql-master-baseline : 00000000050000D8
+ pgsql-status : PRI
* Node pg13vm02 (2):
+ master-pgsql : -INFINITY
+ pgsql-data-status : STREAMING|ASYNC
+ pgsql-status : HS:async
* Node pg13vm03 (3):
+ master-pgsql : 100
+ pgsql-data-status : STREAMING|SYNC
+ pgsql-status : HS:sync
+ pgsql-xlog-loc : 0000000009000060
Migration Summary:
* Node pg13vm01 (1):
* Node pg13vm02 (2):
* Node pg13vm03 (3):
Fencing History:
PCSD Status:
pg13vm03: Online
pg13vm01: Online
pg13vm02: Online
Daemon Status:
corosync: active/disabled
pacemaker: active/disabled
pcsd: active/enabled
#
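The replica-vip location rules configured earlier key off the pgsql-status node attribute shown above (PRI on the primary, HS:sync / HS:async on the standbys), so listing that attribute per node is a quick health summary. A sketch that parses an embedded sample of the "pcs status --full" node-attribute section:

```shell
# Print each node's pgsql-status from "pcs status --full" style output.
cat > /tmp/attrs.sample <<'EOF'
* Node pg13vm01 (1):
    + pgsql-status : PRI
* Node pg13vm02 (2):
    + pgsql-status : HS:async
* Node pg13vm03 (3):
    + pgsql-status : HS:sync
EOF
awk '/\* Node/ { node = $3 } /pgsql-status/ { print node, $NF }' /tmp/attrs.sample
```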