Database architecture: one master, two slaves
Master: 192.168.8.57
Slave1: 192.168.8.58
Slave2: 192.168.8.59
Manager: 192.168.8.60
VIP: 192.168.8.88
MHA packages:
mha4mysql-manager-0.58.tar.gz
mha4mysql-node-0.58.tar.gz
I. Add the VIP on the Master
/sbin/ifconfig enp0s3:1 192.168.8.88/24
ifconfig
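To confirm from a script that the VIP is really bound, the interface list can be parsed instead of read by eye. The following is a minimal sketch, assuming the `ip` tool from iproute2 is installed on the master; `has_vip` is a hypothetical helper name and is not part of MHA.

```shell
#!/bin/bash
# has_vip VIP
# Reads `ip -o -4 addr show` style lines on stdin and succeeds (exit 0)
# only if one of them carries exactly the given address.
has_vip() {
    local vip="$1"
    # In `ip -o -4 addr show` output the 4th field is ADDRESS/PREFIX.
    awk -v vip="$vip" '
        { split($4, a, "/"); if (a[1] == vip) found = 1 }
        END { exit !found }'
}

# Usage on the master (not executed here):
#   ip -o -4 addr show | has_vip 192.168.8.88 && echo "VIP is bound"
```

The helper only inspects text, so it can be fed saved output from any host.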
II. Automatic failover test
1. Manually stop the MySQL process on the Master
mysqladmin -hlocalhost -uroot -pmysql shutdown
2. Check the Manager log
[root@manager bin]# tail -f /var/log/masterha/app1/manager.log
Thu Oct 25 19:34:53 2018 - [warning] Got error on MySQL select ping: 2006 (MySQL server has gone away)
Thu Oct 25 19:34:53 2018 - [info] Executing SSH check script: exit 0
Thu Oct 25 19:34:53 2018 - [info] Executing secondary network check script: /usr/local/bin/masterha_secondary_check -s 192.168.8.58 -s 192.168.8.59 --user=root --master_host=192.168.8.57 --master_ip=192.168.8.57 --master_port=3306 --master_user=root --master_password=mysql --ping_type=SELECT
Thu Oct 25 19:34:53 2018 - [info] HealthCheck: SSH to 192.168.8.57 is reachable.
Monitoring server 192.168.8.58 is reachable, Master is not reachable from 192.168.8.58. OK.
Thu Oct 25 19:34:54 2018 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.8.57' (111))
Thu Oct 25 19:34:54 2018 - [warning] Connection failed 2 time(s)..
Monitoring server 192.168.8.59 is reachable, Master is not reachable from 192.168.8.59. OK.
Thu Oct 25 19:34:54 2018 - [info] Master is not reachable from all other monitoring servers. Failover should start.
Thu Oct 25 19:34:55 2018 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.8.57' (111))
Thu Oct 25 19:34:55 2018 - [warning] Connection failed 3 time(s)..
Thu Oct 25 19:34:56 2018 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.8.57' (111))
Thu Oct 25 19:34:56 2018 - [warning] Connection failed 4 time(s)..
Thu Oct 25 19:34:56 2018 - [warning] Master is not reachable from health checker!
Thu Oct 25 19:34:56 2018 - [warning] Master 192.168.8.57(192.168.8.57:3306) is not reachable!
Thu Oct 25 19:34:56 2018 - [warning] SSH is reachable.
Thu Oct 25 19:34:56 2018 - [info] Connecting to a master server failed. Reading configuration file /etc/masterha_default.cnf and /etc/masterha/app1.cnf again, and trying to connect to all servers to check server status..
Thu Oct 25 19:34:56 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu Oct 25 19:34:56 2018 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Thu Oct 25 19:34:56 2018 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Thu Oct 25 19:34:57 2018 - [info] GTID failover mode = 1
Thu Oct 25 19:34:57 2018 - [info] Dead Servers:
Thu Oct 25 19:34:57 2018 - [info] 192.168.8.57(192.168.8.57:3306)
Thu Oct 25 19:34:57 2018 - [info] Alive Servers:
Thu Oct 25 19:34:57 2018 - [info] 192.168.8.58(192.168.8.58:3306)
Thu Oct 25 19:34:57 2018 - [info] 192.168.8.59(192.168.8.59:3306)
Thu Oct 25 19:34:57 2018 - [info] Alive Slaves:
Thu Oct 25 19:34:57 2018 - [info] 192.168.8.58(192.168.8.58:3306) Version=5.7.23-log (oldest major version between slaves) log-bin:enabled
Thu Oct 25 19:34:57 2018 - [info] GTID ON
Thu Oct 25 19:34:57 2018 - [info] Replicating from 192.168.8.57(192.168.8.57:3306)
Thu Oct 25 19:34:57 2018 - [info] Primary candidate for the new Master (candidate_master is set)
Thu Oct 25 19:34:57 2018 - [info] 192.168.8.59(192.168.8.59:3306) Version=5.7.23-log (oldest major version between slaves) log-bin:enabled
Thu Oct 25 19:34:57 2018 - [info] GTID ON
Thu Oct 25 19:34:57 2018 - [info] Replicating from 192.168.8.57(192.168.8.57:3306)
Thu Oct 25 19:34:57 2018 - [info] Checking slave configurations..
Thu Oct 25 19:34:57 2018 - [info] Checking replication filtering settings..
Thu Oct 25 19:34:57 2018 - [info] Replication filtering check ok.
Thu Oct 25 19:34:57 2018 - [info] Master is down!
Thu Oct 25 19:34:57 2018 - [info] Terminating monitoring script.
Thu Oct 25 19:34:57 2018 - [info] Got exit code 20 (Master dead).
Thu Oct 25 19:34:57 2018 - [info] MHA::MasterFailover version 0.58.
Thu Oct 25 19:34:57 2018 - [info] Starting master failover.
Thu Oct 25 19:34:57 2018 - [info]
Thu Oct 25 19:34:57 2018 - [info] * Phase 1: Configuration Check Phase..
Thu Oct 25 19:34:57 2018 - [info]
Thu Oct 25 19:34:58 2018 - [info] GTID failover mode = 1
Thu Oct 25 19:34:58 2018 - [info] Dead Servers:
Thu Oct 25 19:34:58 2018 - [info] 192.168.8.57(192.168.8.57:3306)
Thu Oct 25 19:34:58 2018 - [info] Checking master reachability via MySQL(double check)...
Thu Oct 25 19:34:58 2018 - [info] ok.
Thu Oct 25 19:34:58 2018 - [info] Alive Servers:
Thu Oct 25 19:34:58 2018 - [info] 192.168.8.58(192.168.8.58:3306)
Thu Oct 25 19:34:58 2018 - [info] 192.168.8.59(192.168.8.59:3306)
Thu Oct 25 19:34:58 2018 - [info] Alive Slaves:
Thu Oct 25 19:34:58 2018 - [info] 192.168.8.58(192.168.8.58:3306) Version=5.7.23-log (oldest major version between slaves) log-bin:enabled
Thu Oct 25 19:34:58 2018 - [info] GTID ON
Thu Oct 25 19:34:58 2018 - [info] Replicating from 192.168.8.57(192.168.8.57:3306)
Thu Oct 25 19:34:58 2018 - [info] Primary candidate for the new Master (candidate_master is set)
Thu Oct 25 19:34:58 2018 - [info] 192.168.8.59(192.168.8.59:3306) Version=5.7.23-log (oldest major version between slaves) log-bin:enabled
Thu Oct 25 19:34:58 2018 - [info] GTID ON
Thu Oct 25 19:34:58 2018 - [info] Replicating from 192.168.8.57(192.168.8.57:3306)
Thu Oct 25 19:34:58 2018 - [info] Starting GTID based failover.
Thu Oct 25 19:34:58 2018 - [info]
Thu Oct 25 19:34:58 2018 - [info] ** Phase 1: Configuration Check Phase completed.
Thu Oct 25 19:34:58 2018 - [info]
Thu Oct 25 19:34:58 2018 - [info] * Phase 2: Dead Master Shutdown Phase..
Thu Oct 25 19:34:58 2018 - [info]
Thu Oct 25 19:34:58 2018 - [info] Forcing shutdown so that applications never connect to the current master..
Thu Oct 25 19:34:58 2018 - [info] Executing master IP deactivation script:
Thu Oct 25 19:34:58 2018 - [info] /usr/local/bin/master_ip_failover --orig_master_host=192.168.8.57 --orig_master_ip=192.168.8.57 --orig_master_port=3306 --command=stopssh --ssh_user=root
Thu Oct 25 19:34:58 2018 - [info] done.
Thu Oct 25 19:34:58 2018 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master.
Thu Oct 25 19:34:58 2018 - [info] * Phase 2: Dead Master Shutdown Phase completed.
Thu Oct 25 19:34:58 2018 - [info]
Thu Oct 25 19:34:58 2018 - [info] * Phase 3: Master Recovery Phase..
Thu Oct 25 19:34:58 2018 - [info]
Thu Oct 25 19:34:58 2018 - [info] * Phase 3.1: Getting Latest Slaves Phase..
Thu Oct 25 19:34:58 2018 - [info]
Thu Oct 25 19:34:58 2018 - [info] The latest binary log file/position on all slaves is mysql-bin.000010:707
Thu Oct 25 19:34:58 2018 - [info] Retrieved Gtid Set: a92f70a4-d5ea-11e8-af28-080027c0450d:7-9
Thu Oct 25 19:34:58 2018 - [info] Latest slaves (Slaves that received relay log files to the latest):
Thu Oct 25 19:34:58 2018 - [info] 192.168.8.58(192.168.8.58:3306) Version=5.7.23-log (oldest major version between slaves) log-bin:enabled
Thu Oct 25 19:34:58 2018 - [info] GTID ON
Thu Oct 25 19:34:58 2018 - [info] Replicating from 192.168.8.57(192.168.8.57:3306)
Thu Oct 25 19:34:58 2018 - [info] Primary candidate for the new Master (candidate_master is set)
Thu Oct 25 19:34:58 2018 - [info] 192.168.8.59(192.168.8.59:3306) Version=5.7.23-log (oldest major version between slaves) log-bin:enabled
Thu Oct 25 19:34:58 2018 - [info] GTID ON
Thu Oct 25 19:34:58 2018 - [info] Replicating from 192.168.8.57(192.168.8.57:3306)
Thu Oct 25 19:34:58 2018 - [info] The oldest binary log file/position on all slaves is mysql-bin.000010:707
Thu Oct 25 19:34:58 2018 - [info] Retrieved Gtid Set: a92f70a4-d5ea-11e8-af28-080027c0450d:7-9
Thu Oct 25 19:34:58 2018 - [info] Oldest slaves:
Thu Oct 25 19:34:58 2018 - [info] 192.168.8.58(192.168.8.58:3306) Version=5.7.23-log (oldest major version between slaves) log-bin:enabled
Thu Oct 25 19:34:58 2018 - [info] GTID ON
Thu Oct 25 19:34:58 2018 - [info] Replicating from 192.168.8.57(192.168.8.57:3306)
Thu Oct 25 19:34:58 2018 - [info] Primary candidate for the new Master (candidate_master is set)
Thu Oct 25 19:34:58 2018 - [info] 192.168.8.59(192.168.8.59:3306) Version=5.7.23-log (oldest major version between slaves) log-bin:enabled
Thu Oct 25 19:34:58 2018 - [info] GTID ON
Thu Oct 25 19:34:58 2018 - [info] Replicating from 192.168.8.57(192.168.8.57:3306)
Thu Oct 25 19:34:58 2018 - [info]
Thu Oct 25 19:34:58 2018 - [info] * Phase 3.3: Determining New Master Phase..
Thu Oct 25 19:34:58 2018 - [info]
Thu Oct 25 19:34:58 2018 - [info] Searching new master from slaves..
Thu Oct 25 19:34:58 2018 - [info] Candidate masters from the configuration file:
Thu Oct 25 19:34:58 2018 - [info] 192.168.8.58(192.168.8.58:3306) Version=5.7.23-log (oldest major version between slaves) log-bin:enabled
Thu Oct 25 19:34:58 2018 - [info] GTID ON
Thu Oct 25 19:34:58 2018 - [info] Replicating from 192.168.8.57(192.168.8.57:3306)
Thu Oct 25 19:34:58 2018 - [info] Primary candidate for the new Master (candidate_master is set)
Thu Oct 25 19:34:58 2018 - [info] Non-candidate masters:
Thu Oct 25 19:34:58 2018 - [info] Searching from candidate_master slaves which have received the latest relay log events..
Thu Oct 25 19:34:58 2018 - [info] New master is 192.168.8.58(192.168.8.58:3306)
Thu Oct 25 19:34:58 2018 - [info] Starting master failover..
Thu Oct 25 19:34:58 2018 - [info]
From:
192.168.8.57(192.168.8.57:3306) (current master)
+--192.168.8.58(192.168.8.58:3306)
+--192.168.8.59(192.168.8.59:3306)
To:
192.168.8.58(192.168.8.58:3306) (new master)
+--192.168.8.59(192.168.8.59:3306)
Thu Oct 25 19:34:58 2018 - [info]
Thu Oct 25 19:34:58 2018 - [info] * Phase 3.3: New Master Recovery Phase..
Thu Oct 25 19:34:58 2018 - [info]
Thu Oct 25 19:34:58 2018 - [info] Waiting all logs to be applied..
Thu Oct 25 19:34:58 2018 - [info] done.
Thu Oct 25 19:34:58 2018 - [info] Getting new master's binlog name and position..
Thu Oct 25 19:34:58 2018 - [info] mysql-bin.000010:747
Thu Oct 25 19:34:58 2018 - [info] All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='192.168.8.58', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='repl', MASTER_PASSWORD='xxx';
Thu Oct 25 19:34:58 2018 - [info] Master Recovery succeeded. File:Pos:Exec_Gtid_Set: mysql-bin.000010, 747, a92f70a4-d5ea-11e8-af28-080027c0450d:1-9,
a92f70a4-d5ea-11e8-af28-080027c0450f:1-4
Thu Oct 25 19:34:58 2018 - [info] Executing master IP activate script:
Thu Oct 25 19:34:58 2018 - [info] /usr/local/bin/master_ip_failover --command=start --ssh_user=root --orig_master_host=192.168.8.57 --orig_master_ip=192.168.8.57 --orig_master_port=3306 --new_master_host=192.168.8.58 --new_master_ip=192.168.8.58 --new_master_port=3306 --new_master_user='root' --new_master_password=xxx
Undefined subroutine &main::FIXME_xxx_create_user called at /usr/local/bin/master_ip_failover line 94.
Set read_only=0 on the new master.
Creating app user on the new master..
Thu Oct 25 19:34:58 2018 - [error][/usr/lib/perl5/vendor_perl/MHA/MasterFailover.pm, ln1612] Failed to activate master IP address for 192.168.8.58(192.168.8.58:3306) with return code 10:0
Thu Oct 25 19:34:58 2018 - [warning] Proceeding.
Thu Oct 25 19:34:58 2018 - [info] ** Finished master recovery successfully.
Thu Oct 25 19:34:58 2018 - [info] * Phase 3: Master Recovery Phase completed.
Thu Oct 25 19:34:58 2018 - [info]
Thu Oct 25 19:34:58 2018 - [info] * Phase 4: Slaves Recovery Phase..
Thu Oct 25 19:34:58 2018 - [info]
Thu Oct 25 19:34:58 2018 - [info]
Thu Oct 25 19:34:58 2018 - [info] * Phase 4.1: Starting Slaves in parallel..
Thu Oct 25 19:34:58 2018 - [info]
Thu Oct 25 19:34:58 2018 - [info] -- Slave recovery on host 192.168.8.59(192.168.8.59:3306) started, pid: 20757. Check tmp log /var/log/masterha/app1/192.168.8.59_3306_20181025193457.log if it takes time..
Thu Oct 25 19:34:59 2018 - [info]
Thu Oct 25 19:34:59 2018 - [info] Log messages from 192.168.8.59 ...
Thu Oct 25 19:34:59 2018 - [info]
Thu Oct 25 19:34:58 2018 - [info] Resetting slave 192.168.8.59(192.168.8.59:3306) and starting replication from the new master 192.168.8.58(192.168.8.58:3306)..
Thu Oct 25 19:34:58 2018 - [info] Executed CHANGE MASTER.
Thu Oct 25 19:34:58 2018 - [info] Slave started.
Thu Oct 25 19:34:58 2018 - [info] gtid_wait(a92f70a4-d5ea-11e8-af28-080027c0450d:1-9,
a92f70a4-d5ea-11e8-af28-080027c0450f:1-4) completed on 192.168.8.59(192.168.8.59:3306). Executed 0 events.
Thu Oct 25 19:34:59 2018 - [info] End of log messages from 192.168.8.59.
Thu Oct 25 19:34:59 2018 - [info] -- Slave on host 192.168.8.59(192.168.8.59:3306) started.
Thu Oct 25 19:34:59 2018 - [info] All new slave servers recovered successfully.
Thu Oct 25 19:34:59 2018 - [info]
Thu Oct 25 19:34:59 2018 - [info] * Phase 5: New master cleanup phase..
Thu Oct 25 19:34:59 2018 - [info]
Thu Oct 25 19:34:59 2018 - [info] Resetting slave info on the new master..
Thu Oct 25 19:35:00 2018 - [info] 192.168.8.58: Resetting slave info succeeded.
Thu Oct 25 19:35:00 2018 - [info] Master failover to 192.168.8.58(192.168.8.58:3306) completed successfully.
Thu Oct 25 19:35:00 2018 - [info] Deleted server1 entry from /etc/masterha/app1.cnf .
Thu Oct 25 19:35:00 2018 - [info]
----- Failover Report -----
app1: MySQL Master failover 192.168.8.57(192.168.8.57:3306) to 192.168.8.58(192.168.8.58:3306) succeeded
Master 192.168.8.57(192.168.8.57:3306) is down!
Check MHA Manager logs at manager:/var/log/masterha/app1/manager.log for details.
Started automated(non-interactive) failover.
Invalidated master IP address on 192.168.8.57(192.168.8.57:3306)
Selected 192.168.8.58(192.168.8.58:3306) as a new master.
192.168.8.58(192.168.8.58:3306): OK: Applying all logs succeeded.
Failed to activate master IP address for 192.168.8.58(192.168.8.58:3306) with return code 10:0
192.168.8.59(192.168.8.59:3306): OK: Slave started, replicating from 192.168.8.58(192.168.8.58:3306)
192.168.8.58(192.168.8.58:3306): Resetting slave info succeeded.
Master failover to 192.168.8.58(192.168.8.58:3306) completed successfully.
Thu Oct 25 19:35:00 2018 - [info] Sending mail..
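The log shows that the master has been switched; the new master is 192.168.8.58. The log also reports that activating the master IP address failed (return code 10), because the master_ip_failover script still calls the undefined placeholder FIXME_xxx_create_user; MHA logged a warning and proceeded with the failover.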
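Rather than scrolling through the whole log, the promoted host can be pulled out of it with a one-line filter. This is a rough sketch that relies only on the "New master is ..." line shown in the log above; `new_master_from_log` is a hypothetical helper name.

```shell
#!/bin/bash
# new_master_from_log LOGFILE
# Prints the host from the last "New master is HOST(HOST:PORT)" line
# in an MHA manager log. Prints nothing if no failover was recorded.
new_master_from_log() {
    sed -n 's/.*\[info\] *New master is \([^(]*\)(.*/\1/p' "$1" | tail -n 1
}

# Usage on the manager host (not executed here):
#   new_master_from_log /var/log/masterha/app1/manager.log
```

Taking the last match means that a log containing several failovers yields the most recent new master.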
3. Check the status of the slave hosts 192.168.8.58 and 192.168.8.59
192.168.8.58
mysql> show slave status \G
Empty set (0.00 sec)
mysql> show master status \G
*************************** 1. row ***************************
File: mysql-bin.000010
Position: 747
Binlog_Do_DB:
Binlog_Ignore_DB:
Executed_Gtid_Set: a92f70a4-d5ea-11e8-af28-080027c0450d:1-9,
a92f70a4-d5ea-11e8-af28-080027c0450f:1-4
1 row in set (0.00 sec)
mysql> show variables like 'read_only';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| read_only | OFF |
+---------------+-------+
192.168.8.59
mysql> show slave status \G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.8.58
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000010
Read_Master_Log_Pos: 747
Relay_Log_File: slave2-relay-bin.000002
Relay_Log_Pos: 414
Relay_Master_Log_File: mysql-bin.000010
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
mysql> show variables like 'read_only';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| read_only | ON |
+---------------+-------+
As shown, 192.168.8.58 has become the new master and its read_only is now OFF, while 192.168.8.59 replicates from 192.168.8.58 and its read_only is still ON.
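The two fields that matter in the slave status output are Slave_IO_Running and Slave_SQL_Running. A small filter can check both at once; this is a sketch under the assumption that the input is the vertical (`\G`) output shown above, and `replication_ok` is a hypothetical helper name.

```shell
#!/bin/bash
# replication_ok
# Reads `SHOW SLAVE STATUS\G` output on stdin and succeeds (exit 0)
# only if both the IO thread and the SQL thread report "Yes".
replication_ok() {
    awk '
        $1 == "Slave_IO_Running:"  { io  = $2 }
        $1 == "Slave_SQL_Running:" { sql = $2 }
        END { exit !(io == "Yes" && sql == "Yes") }'
}

# Usage on a slave (not executed here; credentials are placeholders):
#   mysql -uroot -p -e 'SHOW SLAVE STATUS\G' | replication_ok && echo "replication OK"
```

An empty result, as on the new master 192.168.8.58, makes the check fail, which is the expected answer for a host that is not a slave.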
4. Test master-slave data replication
Create table t7 on 192.168.8.58
mysql> create table t7(id int(6));
mysql> show tables;
+----------------+
| Tables_in_test |
+----------------+
| t1 |
| t2 |
| t3 |
| t4 |
| t5 |
| t6 |
| t7 |
+----------------+
Check on 192.168.8.59
mysql> show tables;
+----------------+
| Tables_in_test |
+----------------+
| t1 |
| t2 |
| t3 |
| t4 |
| t5 |
| t6 |
| t7 |
+----------------+
The new master-slave replication is working normally.
5. After the switchover, the MHA Manager monitoring process exits on its own. The official documentation explains this and offers a workaround:
Running MHA Manager from daemontools
Currently MHA Manager process does not run as a daemon. If failover completed
successfully or the master process was killed by accident, the manager stops
working. To run as a daemon, daemontools or any external daemon program
can be used. Here is an example to run from daemontools.
Workaround:
vim /usr/local/bin/manager_status_check
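#!/bin/bash
# Watchdog: restart MHA Manager whenever no masterha_manager process is found.
while :
do
    # Count running masterha_manager processes, excluding the grep itself.
    MGRCHECK=$(ps -ef | grep masterha_manager | grep -v grep | wc -l)
    if [ "$MGRCHECK" -eq 0 ]; then
        /usr/local/bin/masterha_start.sh
    else
        echo "MHA manager is running"
    fi
    sleep 5
done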
chmod u+x /usr/local/bin/manager_status_check
nohup /usr/local/bin/manager_status_check &
Add it to /etc/rc.d/rc.local so that it starts automatically at boot
echo "nohup /usr/local/bin/manager_status_check &" >> /etc/rc.d/rc.local
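6. Re-add the original master to MHA
Edit my.cnf on the old master 192.168.8.57 to enable the slave-related parameters; MySQL must be restarted afterwards.
Method 1: Because GTID is enabled, the old master can be pointed at the new master directly with CHANGE MASTER TO.
First compare the data: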
192.168.8.57
mysql> use test
mysql> show tables;
+----------------+
| Tables_in_test |
+----------------+
| t1 |
| t2 |
| t3 |
| t4 |
| t5 |
| t6 |
+----------------+
192.168.8.58
mysql> show tables;
+----------------+
| Tables_in_test |
+----------------+
| t1 |
| t2 |
| t3 |
| t4 |
| t5 |
| t6 |
| t7 |
+----------------+
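With only a handful of tables the difference is easy to spot by eye, but the same comparison can be scripted. The sketch below assumes each host's table list has been saved to a plain text file with one table name per line (for example with `mysql -N -B -e 'SHOW TABLES FROM test'`); `tables_only_in_second` is a hypothetical helper name.

```shell
#!/bin/bash
# tables_only_in_second FILE_A FILE_B
# Prints every line of FILE_B that does not appear as a whole line in
# FILE_A, i.e. the tables that exist on host B but not yet on host A.
tables_only_in_second() {
    # -F fixed strings, -x whole-line match, -v invert, -f patterns from file
    grep -Fxv -f "$1" "$2"
}

# Usage (not executed here; file names are placeholders):
#   tables_only_in_second tables_57.txt tables_58.txt
```

For the two listings above this reports t7, the table created on 192.168.8.58 after the failover.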
Run CHANGE MASTER TO directly on the old master:
mysql> change master to master_host='192.168.8.58',master_port=3306,master_user='repl',master_password='mysql',master_auto_position=1;
Start the slave threads on 192.168.8.57 and check their status
mysql> start slave;
mysql> show slave status \G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.8.58
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000010
Read_Master_Log_Pos: 912
Relay_Log_File: master-relay-bin.000007
Relay_Log_Pos: 619
Relay_Master_Log_File: mysql-bin.000010
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
The slave threads are running, and the master is now 192.168.8.58.
Set read_only=1 on 192.168.8.57
mysql> set global read_only=1;
mysql> show variables like 'read_only';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| read_only | ON |
+---------------+-------+
Check that the data has been replicated to 192.168.8.57
mysql> show tables;
+----------------+
| Tables_in_test |
+----------------+
| t1 |
| t2 |
| t3 |
| t4 |
| t5 |
| t6 |
| t7 |
+----------------+
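Table t7 has now been replicated successfully.
Method 2:
If repairing the original master takes a long time, it is better to reinitialize 192.168.8.57 as a slave from scratch rather than using CHANGE MASTER TO.
7. Update the MHA configuration file
At this point the MHA configuration file app1.cnf is incomplete: as the failover log shows, MHA deleted the server1 entry.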
vim /etc/masterha/app1.cnf
[server default]
manager_log=/var/log/masterha/app1/manager.log
manager_workdir=/var/log/masterha/app1
master_binlog_dir=/data/mysql/data
master_ip_failover_script=/usr/local/bin/master_ip_failover
master_ip_online_change_script=/usr/local/bin/master_ip_online_change
password=mysql
ping_interval=1
remote_workdir=/tmp
repl_password=mysql
repl_user=repl
report_script=/usr/local/bin/send_report
secondary_check_script=/usr/local/bin/masterha_secondary_check -s 192.168.8.58 -s 192.168.8.59
shutdown_script=""
ssh_user=root
user=root
[server2]
candidate_master=1
check_repl_delay=0
hostname=192.168.8.58
port=3306
[server3]
hostname=192.168.8.59
port=3306
After completing it, the file looks like this:
[server default]
manager_log=/var/log/masterha/app1/manager.log
manager_workdir=/var/log/masterha/app1
master_binlog_dir=/data/mysql/data
master_ip_failover_script=/usr/local/bin/master_ip_failover
master_ip_online_change_script=/usr/local/bin/master_ip_online_change
password=mysql
ping_interval=1
remote_workdir=/tmp
repl_password=mysql
repl_user=repl
report_script=/usr/local/bin/send_report
secondary_check_script=/usr/local/bin/masterha_secondary_check -s 192.168.8.57 -s 192.168.8.59
shutdown_script=""
ssh_user=root
user=root
[server1]
candidate_master=1
check_repl_delay=0
hostname=192.168.8.57
port=3306
[server2]
hostname=192.168.8.58
port=3306
[server3]
hostname=192.168.8.59
port=3306
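Before restarting the monitor it is worth confirming that the edited file lists all three nodes again. The following sketch simply extracts every hostname= value from the file; `list_mha_hosts` is a hypothetical helper name and is not an MHA command.

```shell
#!/bin/bash
# list_mha_hosts CONF
# Prints the value of every "hostname=" line in an MHA configuration
# file, one per line, in file order.
list_mha_hosts() {
    sed -n 's/^[[:space:]]*hostname[[:space:]]*=[[:space:]]*//p' "$1"
}

# Usage on the manager host (not executed here):
#   list_mha_hosts /etc/masterha/app1.cnf
```

For the completed file above this should list 192.168.8.57, 192.168.8.58 and 192.168.8.59; a missing address means a [serverN] block still needs to be restored.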
8. Restart the monitoring program
Run the MHA replication health check:
./masterha_check_repl --conf=/etc/masterha/app1.cnf
Thu Oct 25 20:32:04 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
.............................................................................
MySQL Replication Health is OK.
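The health check output is long, and only its final verdict decides whether the monitor should be started. A script can gate on that line; the sketch below matches the exact success message shown above, and `repl_health_ok` is a hypothetical helper name.

```shell
#!/bin/bash
# repl_health_ok
# Reads masterha_check_repl output on stdin and succeeds (exit 0) only
# if it contains the success line "MySQL Replication Health is OK."
repl_health_ok() {
    grep -q 'MySQL Replication Health is OK\.'
}

# Usage on the manager host (not executed here):
#   masterha_check_repl --conf=/etc/masterha/app1.cnf 2>&1 | repl_health_ok \
#       && echo "safe to start the monitor"
```

Matching the full sentence avoids treating a "NOT OK" verdict as success.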
Start the MHA monitoring program:
./masterha_start.sh
Check the Manager log:
# tail -f /var/log/masterha/app1/manager.log
Thu Oct 25 20:33:07 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu Oct 25 20:33:07 2018 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Thu Oct 25 20:33:07 2018 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Thu Oct 25 20:33:07 2018 - [info] MHA::MasterMonitor version 0.58.
Thu Oct 25 20:33:09 2018 - [info] GTID failover mode = 1
Thu Oct 25 20:33:09 2018 - [info] Dead Servers:
Thu Oct 25 20:33:09 2018 - [info] Alive Servers:
Thu Oct 25 20:33:09 2018 - [info] 192.168.8.57(192.168.8.57:3306)
Thu Oct 25 20:33:09 2018 - [info] 192.168.8.58(192.168.8.58:3306)
Thu Oct 25 20:33:09 2018 - [info] 192.168.8.59(192.168.8.59:3306)
Thu Oct 25 20:33:09 2018 - [info] Alive Slaves:
Thu Oct 25 20:33:09 2018 - [info] 192.168.8.57(192.168.8.57:3306) Version=5.7.23-log (oldest major version between slaves) log-bin:enabled
Thu Oct 25 20:33:09 2018 - [info] GTID ON
Thu Oct 25 20:33:09 2018 - [info] Replicating from 192.168.8.58(192.168.8.58:3306)
Thu Oct 25 20:33:09 2018 - [info] Primary candidate for the new Master (candidate_master is set)
Thu Oct 25 20:33:09 2018 - [info] 192.168.8.59(192.168.8.59:3306) Version=5.7.23-log (oldest major version between slaves) log-bin:enabled
Thu Oct 25 20:33:09 2018 - [info] GTID ON
Thu Oct 25 20:33:09 2018 - [info] Replicating from 192.168.8.58(192.168.8.58:3306)
Thu Oct 25 20:33:09 2018 - [info] Current Alive Master: 192.168.8.58(192.168.8.58:3306)
Thu Oct 25 20:33:09 2018 - [info] Checking slave configurations..
Thu Oct 25 20:33:09 2018 - [info] Checking replication filtering settings..
Thu Oct 25 20:33:09 2018 - [info] binlog_do_db= , binlog_ignore_db=
Thu Oct 25 20:33:09 2018 - [info] Replication filtering check ok.
Thu Oct 25 20:33:09 2018 - [info] GTID (with auto-pos) is supported. Skipping all SSH and Node package checking.
Thu Oct 25 20:33:09 2018 - [info] Checking SSH publickey authentication settings on the current master..
Thu Oct 25 20:33:09 2018 - [info] HealthCheck: SSH to 192.168.8.58 is reachable.
Thu Oct 25 20:33:09 2018 - [info]
192.168.8.58(192.168.8.58:3306) (current master)
+--192.168.8.57(192.168.8.57:3306)
+--192.168.8.59(192.168.8.59:3306)
Thu Oct 25 20:33:09 2018 - [info] Checking master_ip_failover_script status:
Thu Oct 25 20:33:09 2018 - [info] /usr/local/bin/master_ip_failover --command=status --ssh_user=root --orig_master_host=192.168.8.58 --orig_master_ip=192.168.8.58 --orig_master_port=3306
Thu Oct 25 20:33:09 2018 - [info] OK.
Thu Oct 25 20:33:09 2018 - [warning] shutdown_script is not defined.
Thu Oct 25 20:33:09 2018 - [info] Set master ping interval 1 seconds.
Thu Oct 25 20:33:09 2018 - [info] Set secondary check script: /usr/local/bin/masterha_secondary_check -s 192.168.8.57 -s 192.168.8.59
Thu Oct 25 20:33:09 2018 - [info] Starting ping health check on 192.168.8.58(192.168.8.58:3306)..
Thu Oct 25 20:33:09 2018 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..
This completes the test of automatic failover and recovery.
Link: http://blog.itpub.net/30135314/viewspace-2217577/
MySQL high-availability architecture with MHA
https://www.cnblogs.com/gomysql/p/3675429.html
https://www.cloudbility.com/club/7104.html