现象:
今天折腾MHA时遇到一个问题,MHA可以正常切换主库和VIP地址,但业务程序 尝试用VIP地址连接MySQL数据库时,提示无法连接数据库。
[root@mysqldb01 ~]# mysql -usystem -pwelcome1 -h192.168.56.100 mysql: [Warning] Using a password on the command line interface can be insecure. ERROR 2003 (HY000): Can't connect to MySQL server on '192.168.56.100' (92) |
处理过程:
- 环境及故障说明
(1).一主两从的GTID复制环境,当前131为主,132和133为从。
root@localhost [(none)]>show slave hosts; +-----------+-----------+------+-----------+--------------------------------------+ | Server_id | Host | Port | Master_id | Slave_UUID | +-----------+-----------+------+-----------+--------------------------------------+ | 133 | mysqldb03 | 3306 | 131 | 00e51b26-0dd1-11ec-8fc6-0800278905f1 | | 132 | mysqldb02 | 3306 | 131 | c0c201cf-0dd0-11ec-8f61-080027101452 | +-----------+-----------+------+-----------+--------------------------------------+ |
(2). 在manager节点上执行以下命令来启动 MHA守护进程,检查MHA状态正常。
[root@mysqldb05 app1]# nohup masterha_manager -conf=/etc/masterha/app1/app1.cnf --remove_dead_master_conf --ignore_last_failover &> /etc/masterha/app1/manager.log &
[root@mysqldb05 app1]# masterha_check_status -conf=/etc/masterha/app1/app1.cnf app1 (pid:19483) is running(0:PING_OK), master:192.168.56.131 |
(3).模拟131主库故障,让MHA执行自动切换工作,切换后的状态。
[root@mysqldb02 ~]# ip addr 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever inet6 ::1/128 scope host valid_lft forever preferred_lft forever 2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000 link/ether 08:00:27:10:14:52 brd ff:ff:ff:ff:ff:ff inet 192.168.56.132/24 brd 192.168.56.255 scope global noprefixroute enp0s3 valid_lft forever preferred_lft forever inet 192.168.56.100/24 brd 192.168.56.255 scope global secondary enp0s3:1 valid_lft forever preferred_lft forever inet6 fe80::ec22:b60d:dd41:feb3/64 scope link noprefixroute valid_lft forever preferred_lft forever |
MHA已经将主库自动切换到132上,同时可以看到VIP(192.168.56.100)也已经切换到132主机上。
(4).模拟业务程序连接数据库
[root@mysqldb01 ~]# mysql -usystem -pwelcome1 -h192.168.56.100 mysql: [Warning] Using a password on the command line interface can be insecure. ERROR 2003 (HY000): Can't connect to MySQL server on '192.168.56.100' (92)
[root@mysqldb02 ~]# mysql -usystem -pwelcome1 -h192.168.56.100 mysql: [Warning] Using a password on the command line interface can be insecure. Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 16 Server version: 5.7.26-log MySQL Community Server (GPL)
Copyright (c) 2000, 2019, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
system@192.168.56.100 [(none)]> |
2、从故障重现可以看出,在VIP所在的主机上可以正常连接数据库,这说明数据库是可以正常提供服务的,但在其他主机无法连接,错误的状态码为92。
3、在其他主机上telnet VIP地址的3306端口
[root@mysqldb01 ~]# telnet 192.168.56.100 3306 Trying 192.168.56.100... telnet: connect to address 192.168.56.100: Protocol not available
[root@mysqldb02 ~]# telnet 192.168.56.100 3306 Trying 192.168.56.100... Connected to 192.168.56.100. Escape character is '^]'. N u&4mysql_native_password |
可以看出,其他 的主机访问VIP的3306端口时就已经出现问题,所以MySQL无法连接。
4、在google上搜索了大量的资料,没有找到有帮助的案例。这个问题的根本原因是VIP地址这块,通过MHA的master_ip_failover脚本切换VIP地址后,为什么无法telnet 3306这个端口?
5、分析master_ip_failover脚本中切换VIP地址的这块代码
my $vip = "192.168.56.100"; my $interface = "enp0s3"; my $key = "1"; my $ssh_start_vip = "/sbin/ifconfig $interface:$key $vip/24"; my $ssh_stop_vip = "/sbin/ifconfig $interface:$key down"; my $ssh_send_garp = "/sbin/arping -U $vip -I $interface -c 1"; |
可以看出,脚本中是通过调用ifconfig enp0s3:1 192.168.56.100/24命令启动VIP地址。我们知道,除了ifconfig命令,还可以使用ip add命令来添加VIP地址。
6、修改master_ip_failover脚本
my $ssh_start_vip = "ip addr add $vip/24 dev $interface"; my $ssh_stop_vip = "ip addr delete $vip/24 dev $interface"; |
将启动VIP地址的命令更换为ip addr add。
7、再次进行多次MHA切换测试,一切正常,所有主机都可以通过VIP连接数据库。
8、在网上搜了搜 ifconfig 添加VIP 和 ip add添加VIP的区别,有人甚至从源码上进行分析得出的结论是两者没区别,但是从这个案例可以看出,两者之间绝对是有区别的,还需要深入学习。