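The first initialization of a Patroni cluster failed. After retrying the initialization with the same configuration, every node reported the same message: waiting for leader to bootstrap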
# systemctl status patroni.service
● patroni.service - PostgreSQL high-availability manager
   Loaded: loaded (/usr/lib/systemd/system/patroni.service; disabled; vendor preset: disabled)
   Active: active (running) since Sat 2020-02-22 17:41:50 CST; 11s ago
 Main PID: 3104 (python3.6)
    Tasks: 5
   CGroup: /system.slice/patroni.service
           └─3104 python3.6 /opt/app/patroni/bin/patroni /opt/app/patroni/etc/postgresql.yml

Feb 22 17:41:50 docker02 systemd[1]: Started PostgreSQL high-availability manager.
Feb 22 17:41:50 docker02 systemd[1]: Starting PostgreSQL high-availability manager...
Feb 22 17:41:50 docker02 patroni[3104]: 2020-02-22 17:41:50,916 INFO: Selected new etcd server http://11.11.11.250:2379
Feb 22 17:41:50 docker02 patroni[3104]: 2020-02-22 17:41:50,928 INFO: No PostgreSQL configuration items changed, nothing to reload.
Feb 22 17:41:50 docker02 patroni[3104]: 2020-02-22 17:41:50,935 INFO: Lock owner: None; I am pg01
Feb 22 17:41:50 docker02 patroni[3104]: 2020-02-22 17:41:50,937 INFO: waiting for leader to bootstrap
Feb 22 17:42:00 docker02 patroni[3104]: 2020-02-22 17:42:00,934 INFO: Lock owner: None; I am pg01
Feb 22 17:42:00 docker02 patroni[3104]: 2020-02-22 17:42:00,939 INFO: waiting for leader to bootstrap
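Cause:
etcd still holds the cluster's initialization record (key: /service/$CLUSTER_NAME/initialize). Once this key exists, Patroni will not run initdb again. Instead, each node waits to bootstrap from an existing leader, and because the first initialization failed, no leader exists (the log shows "Lock owner: None"), so the nodes wait indefinitely.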
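To confirm that the key is present, one could query etcd directly. The following is only a minimal sketch under several assumptions: the namespace is /service (the Patroni default), the cluster name is batman (the one that appears in the patronictl output in this post), Patroni is using the etcd v2 API, and etcdctl is installed on the host. The endpoint is the one reported in the log above.

```shell
# Assumptions: namespace "/service", cluster name "batman", etcd v2 API.
# Adjust these values to match your own Patroni configuration.
NAMESPACE="/service"
CLUSTER_NAME="batman"
INIT_KEY="${NAMESPACE}/${CLUSTER_NAME}/initialize"
echo "$INIT_KEY"

# Query the key only if etcdctl is available; a "key not found" error
# would mean the initialization record is not what blocks the bootstrap.
if command -v etcdctl >/dev/null 2>&1; then
  ETCDCTL_API=2 etcdctl --endpoints http://11.11.11.250:2379 get "$INIT_KEY" || true
fi
```

If the command prints a value instead of an error, the stale initialization record is still in etcd.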
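Solutions:
1. Use patronictl to manually remove /service/$CLUSTER_NAME/initialize from etcd (patronictl remove deletes all of the cluster's information in the DCS, including this key)
2. Set a new cluster name and bootstrap again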
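For the second option, the cluster name is the scope setting in the Patroni configuration file. The sketch below demonstrates the change on a throwaway file rather than on the real /opt/app/patroni/etc/postgresql.yml; the file contents and the new name batman2 are hypothetical, chosen only for illustration.

```shell
# Hypothetical demo on a temporary file; the contents below are illustrative,
# not the actual configuration used in this post.
CONF="$(mktemp)"
printf 'scope: batman\nnamespace: /service/\nname: pg01\n' > "$CONF"

NEW_SCOPE="batman2"   # hypothetical new cluster name

# Rewrite only the top-level "scope:" line, keeping a .bak copy of the original
sed -i.bak "s/^scope:.*/scope: ${NEW_SCOPE}/" "$CONF"
grep '^scope:' "$CONF"
```

Every member of the cluster has to use the same scope value, so the same change would be applied on each node before restarting Patroni.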
The manual removal procedure is as follows:
# patronictl -c /opt/app/patroni/etc/postgresql.yml list
+---------+--------+--------------+------+---------+----+-----------+
| Cluster | Member |     Host     | Role |  State  | TL | Lag in MB |
+---------+--------+--------------+------+---------+----+-----------+
|  batman |  pg01  | 11.11.11.111 |      | stopped |    |  unknown  |
+---------+--------+--------------+------+---------+----+-----------+

# patronictl -c /opt/app/patroni/etc/postgresql.yml remove batman
+---------+--------+--------------+------+---------+----+-----------+
| Cluster | Member |     Host     | Role |  State  | TL | Lag in MB |
+---------+--------+--------------+------+---------+----+-----------+
|  batman |  pg01  | 11.11.11.111 |      | stopped |    |  unknown  |
+---------+--------+--------------+------+---------+----+-----------+
Please confirm the cluster name to remove: batman
You are about to remove all information in DCS for batman, please type: "Yes I am aware": Yes I am aware