一、Keepalived+MySQL Replication的应用场景
MySQL的高可用方案有cluster,MMM,MHA等,这些高可用方案都要三台服务器以上,成本有点高,今天介绍一个低成本高可用方案:MySQL Replication+keepalive这个方案只需要两台服务器,适合小型企业
二、Keepalived双机热备的优缺点
MySQL Replication 架构为master/master,其中一台设置为只读(read_only),当master1故障时,借助keepalive将vip漂移到master2上。继续提供服务。
优点:数据同步非常简单,实现起来相对复杂维护难度低
缺点:MySQL复制是异步的(虽然后面有了半同步),如果主库压力比较大,从库SQL线程目前MySQL的版本是单线程的,延时会比较大,因为同步延时大,当VIP发生切换时候,这个时候两台服务器的数据数据不一致。不过可以借助脚本,比如等SQL线程应用完中继日志以后再关闭read_only提供写服务。然后sync_binlog,innodb_flush_log_at_trx_commit都设置为1,日常监控复制延时,可以尽量避免这个情况,还有一个就是备机处于闲置状态。
三、配置MySQL双主
1.在maseter1里面配置复制帐号以及开启二进制日志,配置日志刷盘相关参数,并且做个全备
在my.cnf:[mysqld]段加上这几个参数
log_bin=dg6-log-bin
log_slave_updates=1
sync_binlog=1
server_id =11
innodb_flush_log_at_trx_commit=1
grant replication slave,replication client on *.* to repl@'192.168.80.%' identified by 'repl';
mysqldump -uroot -p1qaz2wsx -A -R --triggers --single-transaction --opt --master-data=2 > all.sql
如果是已经在跑的MySQL库,而且库很大,建议用Xtrabackup用流方式备份并传输到备库
2.在master2上配置复制相关参数
log_bin=dg7-log-bin
log_slave_updates=1
sync_binlog=1
server_id =12
innodb_flush_log_at_trx_commit=1
3.在master2上恢复备份,并且设置主从复制,复制信息已经在备份里面已经有了,根据这个信息设置主从复制。
[root@dg7 ~]# cat all.sql | grep ^"--" | grep -i change
-- CHANGE MASTER TO MASTER_LOG_FILE='dg6-logbin.000001', MASTER_LOG_POS=2438;
mysql> CHANGE MASTER TO MASTER_LOG_FILE='dg6-logbin.000001', MASTER_LOG_POS=2438,MASTER_HOST='192.168.80.106',MASTER_USER='repl',MASTER_PASSWORD='repl';
Query OK, 0 rows affected (0.01 sec)
mysql> start slave;
Query OK, 0 rows affected (0.01 sec)
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.80.106
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: dg6-logbin.000001
Read_Master_Log_Pos: 2438
Relay_Log_File: dg7-relay-bin.000002
Relay_Log_Pos: 254
Relay_Master_Log_File: dg6-logbin.000001
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 2438
Relay_Log_Space: 408
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
1 row in set (0.00 sec)
mysql>
4.在master1上配置复制master2上面的信息
因为开启log_slave_updates参数,应用中继日志以后,也会写到本地的二进制日志,所以Position会不停的变动,不知道哪个为准,其实用哪个Position都没关系,master1都不会去应用,因为复制的SQL线程不会去应用serverid是自己本身的中继日志,所以只要在master2上show一把就行了。
mysql> show master status\G
*************************** 1. row ***************************
File: dg7-logbin.000001
Position: 542168
Binlog_Do_DB:
Binlog_Ignore_DB:
1 row in set (0.00 sec)
这就是我们需要的信息
mysql> CHANGE MASTER TO MASTER_LOG_FILE=' dg7-logbin.000001', MASTER_LOG_POS=542168,MASTER_HOST='192.168.80.107',MASTER_USER='repl',MASTER_PASSWORD='repl';
mysql> start slave;
Query OK, 0 rows affected (0.00 sec)
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.80.107
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: dg7-logbin.000001
Read_Master_Log_Pos: 542168
Relay_Log_File: dg6-relay-bin.000002
Relay_Log_Pos: 254
Relay_Master_Log_File: dg7-logbin.000001
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 542168
Relay_Log_Space: 408
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 11
1 row in set (0.00 sec)
四、 安装配置keepalvie
1.wget http://www.keepalived.org/software/keepalived-1.2.13.tar.gz
[root@dg6 ~]# tar xf keepalived-1.2.13.tar.gz
[root@dg6 ~]# cd keepalived-1.2.13
[root@dg6 keepalived-1.2.13]# ./configure --prefix=/usr/local/keepalive
checking for gcc... gcc checking whether the C compiler works... yes checking for C compiler default output file name... a.out checking for suffix of executables... checking whether we are cross compiling... no checking for suffix of object files... o checking whether we are using the GNU C compiler... yes checking whether gcc accepts -g... yes checking for gcc option to accept ISO C89... none needed checking for a BSD-compatible install... /usr/bin/install -c checking for strip... strip checking how to run the C preprocessor... gcc -E checking for grep that handles long lines and -e... /bin/grep checking for egrep... /bin/grep -E checking for ANSI C header files... yes checking for sys/wait.h that is POSIX. compatible... yes checking for sys/types.h... yes checking for sys/stat.h... yes checking for stdlib.h... yes checking for string.h... yes checking for memory.h... yes checking for strings.h... yes checking for inttypes.h... yes checking for stdint.h... yes checking for unistd.h... yes checking fcntl.h usability... yes checking fcntl.h presence... yes checking for fcntl.h... yes checking syslog.h usability... yes checking syslog.h presence... yes checking for syslog.h... yes checking for unistd.h... (cached) yes checking sys/ioctl.h usability... yes checking sys/ioctl.h presence... yes checking for sys/ioctl.h... yes checking sys/time.h usability... yes checking sys/time.h presence... yes checking for sys/time.h... yes checking openssl/ssl.h usability... no checking openssl/ssl.h presence... no checking for openssl/ssl.h... no configure: error: !!! OpenSSL is not properly installed on your system. !!! !!! Can not include OpenSSL headers files. !!! 缺少openssl-devel包
[root@dg6 keepalived-1.2.13]# yum install openssl-devel
[root@dg6 keepalived-1.2.13]#./configure --prefix=/usr/local/keepalive --with-kernel-dir=/usr/src/kernels/2.6.32-504.el6.x86_64/
.............................................................................................................
编译过程略
.............................................................................................................
Keepalived configuration
------------------------
Keepalived version : 1.2.13
Compiler : gcc
Compiler flags : -g -O2
Extra Lib : -lssl -lcrypto -lcrypt
Use IPVS Framework : Yes
IPVS sync daemon support : Yes
IPVS use libnl : No
fwmark socket support : Yes
Use VRRP Framework : Yes
Use VRRP VMAC : Yes
SNMP support : No
SHA1 support : No
Use Debug flags : No
[root@dg6 keepalived-1.2.13]#make && make install
[root@dg6 keepalived-1.2.13]#cp -rpf /usr/local/keepalive/etc/rc.d/init.d/keepalived /etc/rc.d/init.d/
[root@dg6 keepalived-1.2.13]#cp -rpf /usr/local/keepalive/etc/sysconfig/keepalived /etc/sysconfig/
[root@dg6 ~]# vim /etc/sysconfig/keepalived
# --vrrp -P Only run with VRRP subsystem.
# --check -C Only run with Health-checker subsystem.
# --dont-release-vrrp -V Dont remove VRRP VIPs & VROUTEs on daemon stop.
# --dont-release-ipvs -I Dont remove IPVS topology on daemon stop.
# --dump-conf -d Dump the configuration data.
# --log-detail -D Detailed log messages.
# --log-facility -S 0-7 Set local syslog facility (default=LOG_DAEMON)
#
#### -S 0 指日志级别 0-7个日志级别请自己脑补 -d 指定配置文件
KEEPALIVED_OPTIONS="-D -d -S 0 -f /usr/local/keepalive/ha.conf"
单独生产一个keepalvie日志方便查看
[root@dg6 keepalived-1.2.13]# vim /etc/rsyslog.conf
添加一行内容,因为keepalvie设置的日志级别为0,所以设置local0.* /var/log/keepalive.log
local0.* /var/log/keepalive.log
[root@dg6 ~]# cd /usr/local/keepalive/
现在关键一步来了,配置keepalive的配置文件,keepalive健康检测有很多方法,我这边健康居于端口检测,
ha.conf ##############master ! Configuration File for keepalived global_defs { notification_email { acassen@firewall.loc failover@firewall.loc sysadmin@firewall.loc } notification_email_from Alexandre.Cassen@firewall.loc smtp_server 127.0.0.1 smtp_connect_timeout router_id ### 名称标识 } # VIP_1 vrrp_instance VI_1 { state BACKUP interface eth0 ###监听VRRP协议的网卡,通俗的讲,跟你VIP同个网段的IP在那个网卡上,你就侦听哪个网卡 virtual_router_id ####虚拟路由ID,两台必须一样 priority advert_int nopreempt ##双 backup 设置高优先级非抢占模型 ,就高优先级恢复后不抢夺VIP authentication { auth_type PASS auth_pass } #######定义keepalvie已经角色变化后执行的脚本
##########backup机定义角色变成master后执行的脚本, notify_master /usr/local/keepalive/master_master.sh ##########backup机定义角色变成backup后执行的脚本, notify_backup /usr/local/keepalive/backup.sh ###定义VIP virtual_ipaddress { 192.168.80.80 } } virtual_server 192.168.80.80 { delay_loop lb_algo rr lb_kind DR nat_mask 255.255.255.0 persistence_timeout protocol TCP ################### 定义real_server real_server 192.168.80.106 { ##########定义端口检测失败的脚本down 脚本 notify_down /usr/local/keepalive/stopkp.sh TCP_CHECK { connect_timeout connect_port } } } ##############backup ! Configuration File for keepalived global_defs { notification_email { acassen@firewall.loc failover@firewall.loc sysadmin@firewall.loc } notification_email_from Alexandre.Cassen@firewall.loc smtp_server 127.0.0.1 smtp_connect_timeout router_id ### 名称标识 } # VIP_1 vrrp_instance VI_1 { state BACKUP interface eth0 virtual_router_id ####虚拟路由ID,两台必须一样 priority #####backup机优先级要低于master advert_int nopreempt ##双 backup 设置高优先级非抢占模型 ,高优先级恢复后不抢夺VIP, authentication { auth_type PASS auth_pass } #######定义keepalive 已经角色变化后执行的脚本
##########backup机定义角色变成master后执行的脚本, notify_master /usr/local/keepalive/backup_master.sh ##########backup机定义角色变成backup后执行的脚本, notify_backup /usr/local/keepalive/backup.sh ###定义VIP virtual_ipaddress { 192.168.80.80 } } virtual_server 192.168.80.80 { delay_loop lb_algo rr lb_kind DR nat_mask 255.255.255.0 persistence_timeout protocol TCP ################### 定义real_server real_server 192.168.80.107 {
##########定义端口检测失败的脚本down 脚本 notify_down /usr/local/keepalive/stopkp.sh TCP_CHECK { connect_timeout connect_port } } } 关键参数解释 notify_master:状态改变为master以后执行的脚本。 notify_backup: 状态改变为backup以后执行的脚本。 notify_fault: 状态改变为fault后执行的脚本。 notify_stop: VRRP停止以后执行的脚本。
notify_down: real_server 停止以后执行的脚本。
vim master_master.sh
#!/bin/bash
[ -e /etc/profile ] && source /etc/profile || exit 0
mysql -uroot -p1qaz2wsx -e "set global read_only=0;"
mysql -uroot -p1qaz2wsx -e "flush logs;GRANT SELECT, INSERT, UPDATE, DELETE, CREATE, SHOW DATABASES, CREATE TEMPORARY TABLES, EXECUTE, CREATE VIEW, SHOW VIEW, CREATE ROUTINE, ALTER ROUTINE, TRIGGER ON *.* TO 'app'@'192.168.%.%' IDENTIFIED BY '123456';flush privileges;"
mysql -uroot -p1qaz2wsx -e "show master status;" > /tmp/master_status_$(date "+%y%m%d-%H%M").txt
############################ 定义backup机变为master状态脚本 ###################################
vim backup_master.sh
#!/bin/bash
[ -e /etc/profile ] && source /etc/profile || exit 0
Master_Log_File=$(mysql -uroot -p1qaz2wsx -e "show slave status\G" | grep -w "Master_Log_File" | awk -F": " '{print $2}')
Relay_Master_Log_File=$(mysql -uroot -p1qaz2wsx -e "show slave status\G" | grep -w "Relay_Master_Log_File" | awk -F": " '{print $2}')
Read_Master_Log_Pos=$(mysql -uroot -p1qaz2wsx -e "show slave status\G" | grep -w "Read_Master_Log_Pos" | awk -F": " '{print $2}')
Exec_Master_Log_Pos=$(mysql -uroot -p1qaz2wsx -e "show slave status\G" | grep -w "Exec_Master_Log_Pos" | awk -F": " '{print $2}')
###grep -w是强制匹配整个单词的意思 force PATTERN to match only whole words
################ 设置一个死循环,首先判断复制是否有延迟,如果有延迟,等60秒,不论是否有延迟停止复制线程,执行切换master动作
i=1
while true
do
if [ $Master_Log_File = $Relay_Master_Log_File ] && [ $Read_Master_Log_Pos -eq $Exec_Master_Log_Pos ]
then
echo "slave is no delay"
break
else
sleep 1
if [ $i -gt 60 ]
then
break
fi
continue
let i++
fi
done
mysql -uroot -p1qaz2wsx -e "stop slave;"
mysql -uroot -p1qaz2wsx -e "set global read_only=0;"
mysql -uroot -p1qaz2wsx -e "flush logs;GRANT SELECT, INSERT, UPDATE, DELETE, CREATE, SHOW DATABASES, CREATE TEMPORARY TABLES, EXECUTE, CREATE VIEW, SHOW VIEW, CREATE ROUTINE, ALTER ROUTINE, TRIGGER ON *.* TO 'app'@'192.168.%.%' IDENTIFIED BY '123456';flush privileges;"
mysql -uroot -p1qaz2wsx -e "show master status;" > /tmp/master_status_$(date "+%y%m%d-%H%M").txt
#当keepalvie的backup提升为master以后,发送邮件告知管理员
echo "#####################################" > /tmp/keepalvie.txt
echo "双主已经发生切换了,请检查并修复问题" >> /tmp/keepalvie.txt
ip add | grep -i eth0 | grep inet | awk {'print $2'}|head -n 1 >> /tmp/keepalvie.txt
echo "切换到当前这一台了" >> /tmp/keepalvie.txt
echo "#####################################" >> /tmp/keepalvie.txt
##########发生邮件我是用sendmail这个软件。需要独立安装
/usr/local/sendemail/sendEmail -f xxxxx@139.com -t xxxx@139.com -s smtp.139.com -u ""双主已经发生切换了,请检查并修复问题" -xu xxx@139.com -xp xxx -m < /tmp/keepalvie.txt
##########backup 状态就比较简单,设置只读模式,删除app用户,防止误连接
vim backup.sh
#!/bin/bash
[ -e /etc/profile ] && source /etc/profile || exit 0
mysql -uroot -p1qaz2wsx -e "set global read_only=1;"
mysql -uroot -p1qaz2wsx -e "use mysql;delete from user where user.user='app';flush privileges;"
##########stop.sh因为检测到端口失败,所以检测mysql进程是否还活着,状态是否正常,不正常切换VIP
vim stopkp.sh
#!/bin/bash
[ -e /etc/profile ] && source /etc/profile || exit 0
mysqlstatus1=`mysqladmin -uroot -p1qaz2wsx ping | grep -o alive`
mysqlstatus2=`mysql -uroot -p1qaz2wsx -e "show status;"| grep -i Com | wc -l`
mysqlpid=`ps aux | grep mysqld | grep -v grep| wc -l`
echo $mysqlstatus1
echo $mysqlstatus2
echo $mysqlpid
if [ "$mysqlstatus1" != "alive" -o "$mysqlpid" == "0" -o "$mysqlstatus2" == "0" ]
then
echo "mysql is not running stop keepalive move vip"
/etc/init.d/keepalived stop
else
echo "mysql is running exit shell"
exit 0
fi
五、验证高可用
在master1上
root@localhost(mytest) node1 > show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.80.107
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: dg7-logbin.000011
Read_Master_Log_Pos: 107
Relay_Log_File: dg6-relay-bin.000021
Relay_Log_Pos: 254
Relay_Master_Log_File: dg7-logbin.000011
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 107
Relay_Log_Space: 555
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 11
1 row in set (0.00 sec)
root@localhost(mytest) node1 > select count(*) from test;
+----------+
| count(*) |
+----------+
| 13 |
+----------+
1 row in set (0.03 sec)
插入一些数据
root@localhost(mytest) node1 >
root@localhost(mytest) node1 > insert into test (c1,c2) values ('adf','adfa');
Query OK, 1 row affected (0.02 sec)
root@localhost(mytest) node1 > insert into test (c1,c2) values ('adf','adfaadf');
Query OK, 1 row affected (0.02 sec)
root@localhost(mytest) node1 > insert into test (c1,c2) values ('adf','adadfdf');
Query OK, 1 row affected (0.01 sec)
root@localhost(mytest) node1 > select count(*) from test;
+----------+
| count(*) |
+----------+
| 16 |
+----------+
1 row in set (0.00 sec)
root@localhost(mytest) node1 >
在master2上
root@localhost(mytest) node2 > show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.80.106
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: dg6-logbin.000001
Read_Master_Log_Pos: 5064
Relay_Log_File: dg7-relay-bin.000021
Relay_Log_Pos: 968
Relay_Master_Log_File: dg6-logbin.000001
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 5064
Relay_Log_Space: 1122
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
1 row in set (0.00 sec)
root@localhost(mytest) node2 > select count(*) from test;
+----------+
| count(*) |
+----------+
| 16 |
+----------+
1 row in set (0.00 sec)
root@localhost(mytest) node2 >
同步没异常,插入几条记录
###########
root@localhost(mytest) node2 > insert into test (c1,c2) values ('adf','hahahaha');
Query OK, 1 row affected (0.01 sec)
root@localhost(mytest) node2 > insert into test (c1,c2) values ('adf','hahahaha');
Query OK, 1 row affected (0.01 sec)
root@localhost(mytest) node2 > insert into test (c1,c2) values ('adf','hahahaha');
Query OK, 1 row affected (0.02 sec)
root@localhost(mytest) node2 > insert into test (c1,c2) values ('adf','hahahaha');
Query OK, 1 row affected (0.00 sec)
root@localhost(mytest) node2 > select count(*) from test;
+----------+
| count(*) |
+----------+
| 20 |
+----------+
1 row in set (0.00 sec)
root@localhost(mytest) node2 >
然后会master1看看
root@localhost(mytest) node1 > select count(*) from test;
+----------+
| count(*) |
+----------+
| 16 |
+----------+
1 row in set (0.00 sec)
root@localhost(mytest) node1 > select count(*) from test;
+----------+
| count(*) |
+----------+
| 20 |
+----------+
1 row in set (0.00 sec)
root@localhost(mytest) node1 >
双主没问题的。
master启动日志
Jan :: dg6 Keepalived[]: Starting Keepalived v1.2.13 (/,) Jan :: dg6 Keepalived[]: Starting Healthcheck child process, pid= Jan :: dg6 Keepalived[]: Starting VRRP child process, pid= Jan :: dg6 Keepalived_healthcheckers[]: Initializing ipvs 2.6 Jan :: dg6 Keepalived_vrrp[]: Netlink reflector reports IP 192.168.80.106 added Jan :: dg6 Keepalived_vrrp[]: Netlink reflector reports IP fe80::20c:29ff:feb2: added Jan :: dg6 Keepalived_vrrp[]: Registering Kernel netlink reflector Jan :: dg6 Keepalived_vrrp[]: Registering Kernel netlink command channel Jan :: dg6 Keepalived_vrrp[]: Registering gratuitous ARP shared channel Jan :: dg6 Keepalived_healthcheckers[]: Netlink reflector reports IP 192.168.80.106 added Jan :: dg6 Keepalived_healthcheckers[]: Netlink reflector reports IP fe80::20c:29ff:feb2: added Jan :: dg6 Keepalived_healthcheckers[]: Registering Kernel netlink reflector Jan :: dg6 Keepalived_healthcheckers[]: Registering Kernel netlink command channel Jan :: dg6 Keepalived_vrrp[]: Opening file '/usr/local/keepalive/ha.conf'. Jan :: dg6 Keepalived_healthcheckers[]: Opening file '/usr/local/keepalive/ha.conf'. Jan :: dg6 Keepalived_vrrp[]: Configuration is using : Bytes Jan :: dg6 Keepalived_vrrp[]: ------< Global definitions >------ Jan :: dg6 Keepalived_vrrp[]: Router ID = Jan :: dg6 Keepalived_vrrp[]: Smtp server = 127.0.0.1 Jan :: dg6 Keepalived_vrrp[]: Smtp server connection timeout = Jan :: dg6 Keepalived_vrrp[]: Email notification from = Alexandre.Cassen@firewall.loc Jan :: dg6 Keepalived_vrrp[]: Email notification = acassen@firewall.loc Jan :: dg6 Keepalived_vrrp[]: Email notification = failover@firewall.loc Jan :: dg6 Keepalived_vrrp[]: Email notification = sysadmin@firewall.loc Jan :: dg6 Keepalived_vrrp[]: VRRP IPv4 mcast group = 224.0.0.18 Jan :: dg6 Keepalived_vrrp[]: VRRP IPv6 mcast group = 224.0.0.18 Jan :: dg6 Keepalived_vrrp[]: ------< VRRP Topology >------ Jan :: dg6 Keepalived_vrrp[]: VRRP Instance = VI_1 Jan :: dg6 Keepalived_vrrp[]: Want State = BACKUP Jan :: dg6 Keepalived_vrrp[]: Runing on device = eth0 Jan :: dg6 Keepalived_vrrp[]: Virtual Router ID = Jan :: dg6 Keepalived_vrrp[]: Priority = Jan :: dg6 Keepalived_vrrp[]: Advert interval = 1sec Jan :: dg6 Keepalived_vrrp[]: Preempt disabled Jan :: dg6 Keepalived_vrrp[]: Authentication type = SIMPLE_PASSWORD Jan :: dg6 Keepalived_vrrp[]: Password = Jan :: dg6 Keepalived_vrrp[]: Virtual IP = Jan :: dg6 Keepalived_vrrp[]: 192.168.80.80/ dev eth0 scope global Jan :: dg6 Keepalived_vrrp[]: Backup state transition script = /usr/local/keepalive/backup.sh Jan :: dg6 Keepalived_vrrp[]: Master state transition script = /usr/local/keepalive/master_master.sh Jan :: dg6 Keepalived_healthcheckers[]: Configuration is using : Bytes Jan :: dg6 Keepalived_vrrp[]: Using LinkWatch kernel netlink reflector... Jan :: dg6 Keepalived_healthcheckers[]: ------< Global definitions >------ Jan :: dg6 Keepalived_healthcheckers[]: Router ID = Jan :: dg6 Keepalived_healthcheckers[]: Smtp server = 127.0.0.1 Jan :: dg6 Keepalived_healthcheckers[]: Smtp server connection timeout = Jan :: dg6 Keepalived_healthcheckers[]: Email notification from = Alexandre.Cassen@firewall.loc Jan :: dg6 Keepalived_healthcheckers[]: Email notification = acassen@firewall.loc Jan :: dg6 Keepalived_healthcheckers[]: Email notification = failover@firewall.loc Jan :: dg6 Keepalived_healthcheckers[]: Email notification = sysadmin@firewall.loc Jan :: dg6 Keepalived_healthcheckers[]: VRRP IPv4 mcast group = 224.0.0.18 Jan :: dg6 Keepalived_healthcheckers[]: VRRP IPv6 mcast group = 224.0.0.18 Jan :: dg6 Keepalived_healthcheckers[]: ------< SSL definitions >------ Jan :: dg6 Keepalived_healthcheckers[]: Using autogen SSL context Jan :: dg6 Keepalived_healthcheckers[]: ------< LVS Topology >------ Jan :: dg6 Keepalived_healthcheckers[]: System is compiled with LVS v1.2.1 Jan :: dg6 Keepalived_healthcheckers[]: VIP = 192.168.80.80, VPORT = Jan :: dg6 Keepalived_healthcheckers[]: delay_loop = , lb_algo = rr Jan :: dg6 Keepalived_healthcheckers[]: persistence timeout = Jan :: dg6 Keepalived_healthcheckers[]: protocol = TCP Jan :: dg6 Keepalived_healthcheckers[]: alpha is OFF, omega is OFF Jan :: dg6 Keepalived_healthcheckers[]: quorum = , hysteresis = Jan :: dg6 Keepalived_healthcheckers[]: lb_kind = DR Jan :: dg6 Keepalived_healthcheckers[]: RIP = 192.168.80.106, RPORT = , WEIGHT = Jan :: dg6 Keepalived_healthcheckers[]: -> Notify script DOWN = /usr/local/keepalive/stopkp.sh Jan :: dg6 Keepalived_healthcheckers[]: ------< Health checkers >------ Jan :: dg6 Keepalived_healthcheckers[]: [192.168.80.106]: Jan :: dg6 Keepalived_healthcheckers[]: Keepalive method = TCP_CHECK Jan :: dg6 Keepalived_healthcheckers[]: Connection dest = [192.168.80.106]: Jan :: dg6 Keepalived_healthcheckers[]: Connection timeout = Jan :: dg6 Keepalived_healthcheckers[]: Using LinkWatch kernel netlink reflector... Jan :: dg6 Keepalived_healthcheckers[]: Activating healthchecker for service [192.168.80.106]: Jan :: dg6 Keepalived_vrrp[]: VRRP_Instance(VI_1) Entering BACKUP STATE Jan :: dg6 Keepalived_vrrp[]: Opening script file /usr/local/keepalive/backup.sh Jan :: dg6 Keepalived_vrrp[]: VRRP sockpool: [ifindex(), proto(), unicast(), fd(,)] Jan :: dg6 Keepalived_vrrp[]: VRRP_Instance(VI_1) Transition to MASTER STATE Jan :: dg6 Keepalived_vrrp[]: VRRP_Instance(VI_1) Entering MASTER STATE Jan :: dg6 Keepalived_vrrp[]: VRRP_Instance(VI_1) setting protocol VIPs. Jan :: dg6 Keepalived_vrrp[]: VRRP_Instance(VI_1) Sending gratuitous ARPs on eth0 for 192.168.80.80 Jan :: dg6 Keepalived_healthcheckers[]: Netlink reflector reports IP 192.168.80.80 added Jan :: dg6 Keepalived_vrrp[]: Opening script file /usr/local/keepalive/master_master.sh Jan :: dg6 Keepalived_vrrp[]: VRRP_Instance(VI_1) Sending gratuitous ARPs on eth0 for 192.168.80.80
看看有没有VIP了
[root@dg6 keepalive]# ip add | grep eth
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:b2:48:87 brd ff:ff:ff:ff:ff:ff
inet 192.168.80.106/24 brd 192.168.80.255 scope global eth0
inet 192.168.80.80/32 scope global eth0
[root@dg6 keepalive]#
试试连VIP看看
[root@dg1 ~]# ping 192.168.80.80
PING 192.168.80.80 (192.168.80.80) 56(84) bytes of data.
64 bytes from 192.168.80.80: icmp_seq=1 ttl=64 time=0.398 ms
64 bytes from 192.168.80.80: icmp_seq=2 ttl=64 time=0.485 ms
64 bytes from 192.168.80.80: icmp_seq=3 ttl=64 time=0.583 ms
backup启动日志
Jan :: dg7 Keepalived[]: Starting Keepalived v1.2.13 (/,) Jan :: dg7 Keepalived[]: Starting Healthcheck child process, pid= Jan :: dg7 Keepalived[]: Starting VRRP child process, pid= Jan :: dg7 Keepalived_healthcheckers[]: Initializing ipvs 2.6 Jan :: dg7 Keepalived_healthcheckers[]: Netlink reflector reports IP 192.168.80.107 added Jan :: dg7 Keepalived_healthcheckers[]: Netlink reflector reports IP fe80::20c:29ff:feb2: added Jan :: dg7 Keepalived_healthcheckers[]: Registering Kernel netlink reflector Jan :: dg7 Keepalived_healthcheckers[]: Registering Kernel netlink command channel Jan :: dg7 Keepalived_vrrp[]: Netlink reflector reports IP 192.168.80.107 added Jan :: dg7 Keepalived_vrrp[]: Netlink reflector reports IP fe80::20c:29ff:feb2: added Jan :: dg7 Keepalived_vrrp[]: Registering Kernel netlink reflector Jan :: dg7 Keepalived_vrrp[]: Registering Kernel netlink command channel Jan :: dg7 Keepalived_vrrp[]: Registering gratuitous ARP shared channel Jan :: dg7 Keepalived_vrrp[]: Opening file '/usr/local/keepalive/ha.conf'. Jan :: dg7 Keepalived_healthcheckers[]: Opening file '/usr/local/keepalive/ha.conf'. Jan :: dg7 Keepalived_vrrp[]: Configuration is using : Bytes Jan :: dg7 Keepalived_vrrp[]: ------< Global definitions >------ Jan :: dg7 Keepalived_vrrp[]: Router ID = Jan :: dg7 Keepalived_vrrp[]: Smtp server = 127.0.0.1 Jan :: dg7 Keepalived_vrrp[]: Smtp server connection timeout = Jan :: dg7 Keepalived_vrrp[]: Email notification from = Alexandre.Cassen@firewall.loc Jan :: dg7 Keepalived_vrrp[]: Email notification = acassen@firewall.loc Jan :: dg7 Keepalived_vrrp[]: Email notification = failover@firewall.loc Jan :: dg7 Keepalived_vrrp[]: Email notification = sysadmin@firewall.loc Jan :: dg7 Keepalived_vrrp[]: VRRP IPv4 mcast group = 224.0.0.18 Jan :: dg7 Keepalived_vrrp[]: VRRP IPv6 mcast group = 224.0.0.18 Jan :: dg7 Keepalived_vrrp[]: ------< VRRP Topology >------ Jan :: dg7 Keepalived_vrrp[]: VRRP Instance = VI_1 Jan :: dg7 Keepalived_vrrp[]: Want State = BACKUP Jan :: dg7 Keepalived_vrrp[]: Runing on device = eth0 Jan :: dg7 Keepalived_vrrp[]: Virtual Router ID = Jan :: dg7 Keepalived_vrrp[]: Priority = Jan :: dg7 Keepalived_vrrp[]: Advert interval = 1sec Jan :: dg7 Keepalived_vrrp[]: Preempt disabled Jan :: dg7 Keepalived_healthcheckers[]: Configuration is using : Bytes Jan :: dg7 Keepalived_vrrp[]: Authentication type = SIMPLE_PASSWORD Jan :: dg7 Keepalived_vrrp[]: Password = Jan :: dg7 Keepalived_vrrp[]: Virtual IP = Jan :: dg7 Keepalived_vrrp[]: 192.168.80.80/ dev eth0 scope global Jan :: dg7 Keepalived_vrrp[]: Backup state transition script = /usr/local/keepalive/backup.sh Jan :: dg7 Keepalived_vrrp[]: Master state transition script = /usr/local/keepalive/backup_master.sh Jan :: dg7 Keepalived_vrrp[]: Using LinkWatch kernel netlink reflector... Jan :: dg7 Keepalived_healthcheckers[]: ------< Global definitions >------ Jan :: dg7 Keepalived_healthcheckers[]: Router ID = Jan :: dg7 Keepalived_healthcheckers[]: Smtp server = 127.0.0.1 Jan :: dg7 Keepalived_healthcheckers[]: Smtp server connection timeout = Jan :: dg7 Keepalived_healthcheckers[]: Email notification from = Alexandre.Cassen@firewall.loc Jan :: dg7 Keepalived_healthcheckers[]: Email notification = acassen@firewall.loc Jan :: dg7 Keepalived_healthcheckers[]: Email notification = failover@firewall.loc Jan :: dg7 Keepalived_healthcheckers[]: Email notification = sysadmin@firewall.loc Jan :: dg7 Keepalived_healthcheckers[]: VRRP IPv4 mcast group = 224.0.0.18 Jan :: dg7 Keepalived_healthcheckers[]: VRRP IPv6 mcast group = 224.0.0.18 Jan :: dg7 Keepalived_healthcheckers[]: ------< SSL definitions >------ Jan :: dg7 Keepalived_healthcheckers[]: Using autogen SSL context Jan :: dg7 Keepalived_healthcheckers[]: ------< LVS Topology >------ Jan :: dg7 Keepalived_healthcheckers[]: System is compiled with LVS v1.2.1 Jan :: dg7 Keepalived_healthcheckers[]: VIP = 192.168.80.80, VPORT = Jan :: dg7 Keepalived_healthcheckers[]: delay_loop = , lb_algo = rr Jan :: dg7 Keepalived_healthcheckers[]: persistence timeout = Jan :: dg7 Keepalived_healthcheckers[]: protocol = TCP Jan :: dg7 Keepalived_healthcheckers[]: alpha is OFF, omega is OFF Jan :: dg7 Keepalived_healthcheckers[]: quorum = , hysteresis = Jan :: dg7 Keepalived_healthcheckers[]: lb_kind = DR Jan :: dg7 Keepalived_healthcheckers[]: RIP = 192.168.80.107, RPORT = , WEIGHT = Jan :: dg7 Keepalived_healthcheckers[]: -> Notify script DOWN = /usr/local/keepalive/stopkp.sh Jan :: dg7 Keepalived_healthcheckers[]: ------< Health checkers >------ Jan :: dg7 Keepalived_healthcheckers[]: [192.168.80.107]: Jan :: dg7 Keepalived_healthcheckers[]: Keepalive method = TCP_CHECK Jan :: dg7 Keepalived_healthcheckers[]: Connection dest = [192.168.80.107]: Jan :: dg7 Keepalived_healthcheckers[]: Connection timeout = Jan :: dg7 Keepalived_healthcheckers[]: Using LinkWatch kernel netlink reflector... Jan :: dg7 Keepalived_healthcheckers[]: Activating healthchecker for service [192.168.80.107]: Jan :: dg7 Keepalived_vrrp[]: VRRP_Instance(VI_1) Entering BACKUP STATE Jan :: dg7 Keepalived_vrrp[]: Opening script file /usr/local/keepalive/backup.sh Jan :: dg7 Keepalived_vrrp[]: VRRP sockpool: [ifindex(), proto(), unicast(), fd(,)]
发现一个很纳闷的问题,为什么master的app帐号老是会随着backup启动变没了呢,
mysql> select * from user where user='app'\G; *************************** . row *************************** Host: 192.168.%.% User: app Password: *6BB4837EB74329105EE4568DDA7DC67ED2CA2AD9 Select_priv: Y Insert_priv: Y Update_priv: Y Delete_priv: Y Create_priv: Y Drop_priv: N Reload_priv: N Shutdown_priv: N Process_priv: N File_priv: N Grant_priv: N References_priv: N Index_priv: N Alter_priv: N Show_db_priv: Y Super_priv: N Create_tmp_table_priv: Y Lock_tables_priv: N Execute_priv: Y Repl_slave_priv: N Repl_client_priv: N Create_view_priv: Y Show_view_priv: Y Create_routine_priv: Y Alter_routine_priv: Y Create_user_priv: N Event_priv: N Trigger_priv: Y Create_tablespace_priv: N ssl_type: ssl_cipher: x509_issuer: x509_subject: max_questions: max_updates: max_connections: max_user_connections: plugin: authentication_string: NULL row in set (0.00 sec) ERROR: No query specified mysql> mysql> select * from user where user='app'; Empty set (0.00 sec)
原来是backup自动后,状态是backup的 执行了backup脚本,把app用户删除了,二进制日志然后传到master1上执行了这个语句。解决办法很简单,复制过滤,过滤掉mysql库:replicate-ignore-db=mysql
#!/bin/bash
[ -e /etc/profile ] && source /etc/profile || exit 0
mysql -uroot -p1qaz2wsx -e "set global read_only=1;"
mysql -uroot -p1qaz2wsx -e "use mysql;delete from user where user.user='app';flush privileges;
####################################################################################################################
现在就不会出现帐号被删除的情况了
mysql> show slave status\G *************************** . row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 192.168.80.107 Master_User: repl Master_Port: Connect_Retry: Master_Log_File: dg7-logbin. Read_Master_Log_Pos: Relay_Log_File: dg6-relay-bin. Relay_Log_Pos: Relay_Master_Log_File: dg7-logbin. Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: mysql Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: Last_Error: Skip_Counter: Exec_Master_Log_Pos: Relay_Log_Space: Until_Condition: None Until_Log_File: Until_Log_Pos: Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: Master_SSL_Verify_Server_Cert: No Last_IO_Errno: Last_IO_Error: Last_SQL_Errno: Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: row in set (0.00 sec) mysql> mysql> select * from user where user='app'\G; *************************** . row *************************** Host: 192.168.%.% User: app Password: *6BB4837EB74329105EE4568DDA7DC67ED2CA2AD9 Select_priv: Y Insert_priv: Y Update_priv: Y Delete_priv: Y Create_priv: Y Drop_priv: N Reload_priv: N Shutdown_priv: N Process_priv: N File_priv: N Grant_priv: N References_priv: N Index_priv: N Alter_priv: N Show_db_priv: Y Super_priv: N Create_tmp_table_priv: Y Lock_tables_priv: N Execute_priv: Y Repl_slave_priv: N Repl_client_priv: N Create_view_priv: Y Show_view_priv: Y Create_routine_priv: Y Alter_routine_priv: Y Create_user_priv: N Event_priv: N Trigger_priv: Y Create_tablespace_priv: N ssl_type: ssl_cipher: x509_issuer: x509_subject: max_questions: max_updates: max_connections: max_user_connections: plugin: authentication_string: NULL row in set (0.00 sec) ERROR: No query specified mysql>
测试连接VIP,木有问题。
接下来我们试试故障切换,关闭master1的mysql,看看master做了写啥子
Jan 8 14:42:59 dg6 Keepalived_healthcheckers[18702]: TCP connection to [192.168.80.106]:3306 failed !!!
Jan 8 14:42:59 dg6 Keepalived_healthcheckers[18702]: Removing service [192.168.80.106]:3306 from VS [192.168.80.80]:0
Jan 8 14:42:59 dg6 Keepalived_healthcheckers[18702]: Executing [/usr/local/keepalive/stopkp.sh] for service [192.168.80.106]:3306 in VS [192.168.80.80]:0
Jan 8 14:42:59 dg6 Keepalived_healthcheckers[18702]: Lost quorum 1-0=1 > 0 for VS [192.168.80.80]:0
Jan 8 14:42:59 dg6 Keepalived_healthcheckers[18702]: SMTP connection ERROR to [127.0.0.1]:25.
Jan 8 14:42:59 dg6 Keepalived[18700]: Stopping Keepalived v1.2.13 (12/13,2014)
Jan 8 14:42:59 dg6 Keepalived_vrrp[18703]: VRRP_Instance(VI_1) sending 0 priority
Jan 8 14:42:59 dg6 Keepalived_vrrp[18703]: VRRP_Instance(VI_1) removing protocol VIPs.
Jan 8 14:42:59 dg6 Keepalived_healthcheckers[18702]: Netlink reflector reports IP 192.168.80.80 removed
执行了/usr/local/keepalive/stopkp.sh脚本,关闭keepalive VIP飘移
看看backup是啥子情况
Jan 8 14:42:59 dg7 Keepalived_vrrp[17534]: VRRP_Instance(VI_1) Transition to MASTER STATE
Jan 8 14:43:00 dg7 Keepalived_vrrp[17534]: VRRP_Instance(VI_1) Entering MASTER STATE
Jan 8 14:43:00 dg7 Keepalived_vrrp[17534]: VRRP_Instance(VI_1) setting protocol VIPs.
Jan 8 14:43:00 dg7 Keepalived_vrrp[17534]: VRRP_Instance(VI_1) Sending gratuitous ARPs on eth0 for 192.168.80.80
Jan 8 14:43:00 dg7 Keepalived_vrrp[17534]: Opening script file /usr/local/keepalive/backup_master.sh
Jan 8 14:43:00 dg7 Keepalived_healthcheckers[17533]: Netlink reflector reports IP 192.168.80.80 added
Jan 8 14:43:05 dg7 Keepalived_vrrp[17534]: VRRP_Instance(VI_1) Sending gratuitous ARPs on eth0 for 192.168.80.80
转变为master状态,执行了我们定义的脚本
那么我们看看,master2上是不是已经设提供写服务呢
root@localhost(mytest) node2 > show global variables like '%read_only%'; +---------------+-------+ | Variable_name | Value | +---------------+-------+ | read_only | OFF | +---------------+-------+ row in set (0.00 sec) root@localhost(mysql) node2 > select * from user where user='app'\G; *************************** . row *************************** Host: 192.168.%.% User: app Password: *6BB4837EB74329105EE4568DDA7DC67ED2CA2AD9 Select_priv: Y Insert_priv: Y Update_priv: Y Delete_priv: Y Create_priv: Y Drop_priv: N Reload_priv: N Shutdown_priv: N Process_priv: N File_priv: N Grant_priv: N References_priv: N Index_priv: N Alter_priv: N Show_db_priv: Y Super_priv: N Create_tmp_table_priv: Y Lock_tables_priv: N Execute_priv: Y Repl_slave_priv: N Repl_client_priv: N Create_view_priv: Y Show_view_priv: Y Create_routine_priv: Y Alter_routine_priv: Y Create_user_priv: N Event_priv: N Trigger_priv: Y Create_tablespace_priv: N ssl_type: ssl_cipher: x509_issuer: x509_subject: max_questions: max_updates: max_connections: max_user_connections: plugin: authentication_string: NULL row in set (0.00 sec) ERROR: No query specified root@localhost(mysql) node2 >
用VIP连连看,没错,连上去了。接下来就要赶紧修复故障的master1。