Keepalived High-Availability Clusters

I. Introduction to Keepalived

Keepalived is a service used in cluster management to keep a cluster highly available. It monitors the state of the web servers behind it: if a web server crashes or stops working correctly, Keepalived detects the failure and removes that server from the pool; once the server is healthy again, it is automatically added back to the cluster. This also eliminates the single point of failure that a statically configured route (gateway) would otherwise be.

II. How Keepalived Works

Keepalived is built on the VRRP protocol (Virtual Router Redundancy Protocol).

VRRP can be thought of as a protocol for making routers highly available: N routers providing the same function form a router group with one master and several backups. The master holds the externally visible VIP (virtual IP) and keeps sending heartbeat (advertisement) messages to the backups to announce that it is still alive. When the backups stop receiving these heartbeats, they assume the master has gone down and elect a new master among themselves according to the VRRP priority, which keeps the service available.
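To watch this heartbeat on the wire (a quick check, not from the original text; the interface name eth0 is an assumption), capture the VRRP advertisements, which are IP protocol 112 sent to the multicast address 224.0.0.18:

[root@master ~]# tcpdump -i eth0 -nn 'ip proto 112'
# with advert_int 1 you should see roughly one advertisement per second from the current MASTER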

Keepalived consists of three main modules: core, check and vrrp (a quick runtime check follows the list below).

  • core is the heart of keepalived; it starts and maintains the main process and loads and parses the global configuration file.

  • check performs the health checks and supports the common check methods.

  • vrrp implements the VRRP protocol.
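A quick way to see these modules at runtime (a sanity check, not from the original): on a running node keepalived appears as a parent process plus two children, one handling VRRP and one handling the health checks.

[root@master ~]# ps -ef | grep '[k]eepalived'
# expect three processes: the core parent and the vrrp and checker children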

III. The Keepalived Configuration File

Keepalived has a single configuration file, keepalived.conf, which is made up of the following main configuration blocks:

  • global_defs

  • static_ipaddress

  • vrrp_script

  • vrrp_instance

  • virtual_server

1. The global_defs block

This block configures who gets notified when a failure occurs and how this machine identifies itself.

global_defs {
   notification_email {
     acassen@firewall.loc
     failover@firewall.loc
     sysadmin@firewall.loc
   }
   notification_email_from Alexandre.Cassen@firewall.loc
   smtp_server 192.168.200.1
   smtp_connect_timeout 30
   router_id 192.168.224.206
   vrrp_skip_check_adv_addr
   vrrp_strict
   vrrp_garp_interval 0
   vrrp_gna_interval 0
}
  • notification_email: who receives the notification email when a failure occurs
  • notification_email_from: the address the notification email is sent from
  • smtp_server: the SMTP server used to send notifications
  • smtp_connect_timeout: timeout for connecting to the SMTP server
  • enable_traps: enable SNMP (Simple Network Management Protocol) traps
  • router_id: a string identifying this node, usually its hostname or IP address; it is included in failure notifications

2. The vrrp_script block

Defines a health check; when the check fails, the priority of the associated vrrp_instance is reduced by the configured weight.

vrrp_script chk_nginx {
       script "/usr/local/keepalived-1.3.4/nginx_check.sh"
       interval 2 
       weight -20
}
  • script: a monitoring script you write yourself (a sketch of such a script follows this list)

  • interval 2: run the check every 2 seconds

  • weight -20: when the check fails, the priority of the associated vrrp_instance is reduced by 20
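The nginx_check.sh script itself is not shown above. A minimal sketch that fits the weight -20 mechanism (the exact check is an assumption): exit non-zero when nginx is not running, so keepalived lowers this node's priority and the VIP can move to the peer.

#!/bin/bash
# Sketch of a check script: exit 1 when no nginx process exists, exit 0 otherwise.
if ! pidof nginx >/dev/null 2>&1; then
    exit 1
fi
exit 0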

3. The vrrp_instance block

vrrp_instance VI_1 {
    state BACKUP
    interface ens33
    virtual_router_id 51
    mcast_src_ip 192.168.224.206
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.224.208
    }
    track_script {
        chk_nginx
    }
}
  • state: either MASTER or BACKUP. MASTER is the active state, BACKUP is the standby state

  • interface: the network interface to bind to; use ip addr to find your interface name

  • virtual_router_id: the virtual router ID. All members of the same group must use the same value; it also determines the virtual MAC address used for the advertisements

  • priority: this node's priority; the node with the highest priority becomes master

  • advert_int: the interval (in seconds) between the advertisements the MASTER sends to the BACKUPs

  • virtual_ipaddress: the virtual IP (VIP) itself (see the quick check below)
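With both nodes running, a quick way to see which node currently holds the VIP from the example above (interface ens33, VIP 192.168.224.208):

[root@master ~]# ip addr show dev ens33
# the VIP 192.168.224.208 is listed as an additional inet address only on the current MASTER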

IV. Keepalived in Practice

1. Haproxy_Director + Keepalived

I. HAProxy load balancing
Both the primary and the backup director should already be able to schedule requests normally.

II. Keepalived for director HA
Note: both the primary and the backup director must be able to schedule requests normally.
1. Install the software on the primary and backup directors
Installing with yum:
[root@master ~]# yum -y install keepalived 
[root@backup ~]# yum -y install keepalived 
Installing from source:
[root@zdns.cn ~]# yum -y install ipvsadm kernel-headers kernel-devel openssl-devel popt-devel
[root@zdns.cn ~]# wget http://www.keepalived.org/software/keepalived-1.2.2.tar.gz
[root@zdns.cn ~]# tar zxvf keepalived-1.2.2.tar.gz
[root@zdns.cn ~]# cd keepalived-1.2.2
[root@zdns.cn ~]# ./configure --prefix=/
[root@zdns.cn ~]# make 
[root@zdns.cn ~]# make install

2. Keepalived configuration
Master:
[root@zdns.cn ~]# vim /etc/keepalived/keepalived.conf
! Configuration File for keepalived

global_defs {
   router_id director1          # on the backup, change to director2
}

vrrp_instance VI_1 {
    state BACKUP
    nopreempt
    interface eth0              # interface the VIP is bound to
    virtual_router_id 80        # must match on MASTER and BACKUP
    priority 100                # on the backup, change to 50
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.122.100
    }
}
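Per the comments above, the backup director's keepalived.conf differs only in router_id and priority; a sketch of the backup side (same interface, router ID and VIP assumed):

! Configuration File for keepalived (backup director, sketch)

global_defs {
   router_id director2
}

vrrp_instance VI_1 {
    state BACKUP
    nopreempt
    interface eth0
    virtual_router_id 80
    priority 50
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.122.100
    }
}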

 


3. Start Keepalived (on both the primary and the backup)
[root@zdns.cn ~]# chkconfig keepalived on
[root@zdns.cn ~]# service keepalived start
[root@zdns.cn ~]# ip addr


4. Extension: health checking for HAProxy on the director (optional)
Idea:
Have Keepalived run an external script at a fixed interval; when HAProxy has failed, the script stops Keepalived on this machine so the VIP fails over.
a. The script
[root@master ~]# cat /etc/keepalived/check_haproxy_status.sh
#!/bin/bash
# If HAProxy no longer answers on localhost, stop keepalived so the VIP moves.
/usr/bin/curl -I http://localhost &>/dev/null
if [ $? -ne 0 ];then
	/etc/init.d/keepalived stop
fi
[root@master ~]# chmod a+x /etc/keepalived/check_haproxy_status.sh

b. Using the script from keepalived
! Configuration File for keepalived

global_defs {
   router_id director1
}

vrrp_script check_haproxy {
   script "/etc/keepalived/check_haproxy_status.sh"
   interval 5
}

vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    nopreempt
    virtual_router_id 90
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass qfedu
    }
    virtual_ipaddress {
        192.168.122.100
    }

    track_script {
        check_haproxy
    }
}
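To exercise this check (a sketch, not part of the original; the haproxy service name is an assumption), stop HAProxy on the master and watch keepalived stop itself and the VIP move to the backup:

[root@master ~]# service haproxy stop
[root@master ~]# ip addr show dev eth0       # 192.168.122.100 should disappear from here
[root@backup ~]# ip addr show dev eth0       # ...and show up here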

2. Nginx_Director + Keepalived

I. Nginx load balancing
Both the primary and the backup director should already be able to schedule requests normally.


II. Keepalived for director HA
1. Install the software on the primary and backup directors
[root@master ~]# yum -y install keepalived 
[root@backup ~]# yum -y install keepalived

2. Keepalived configuration
BACKUP1:
[root@zdns.cn ~]# vim /etc/keepalived/keepalived.conf
! Configuration File for keepalived

global_defs {
   notification_email {
     root@localhost     
   }
   notification_email_from root@local.domain
   smtp_server 10.8.16.10
   smtp_connect_timeout 30
   router_id SCRM-MySQL-11
}
vrrp_instance VI_1 {
    state BACKUP
    nopreempt
    interface eth0
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        10.8.16.50
    }
}

BACKUP2
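The BACKUP2 file is not shown in the original; by analogy with the HAProxy example it is normally identical to BACKUP1 except for the node ID and a lower priority, e.g. (assumed values):

   router_id SCRM-MySQL-12      # in global_defs; any ID different from BACKUP1 works
    priority 50                  # in vrrp_instance VI_1; lower than BACKUP1's 100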


3. Start Keepalived (on both nodes)
[root@zdns.cn ~]# chkconfig keepalived on
[root@zdns.cn ~]# service keepalived start
[root@zdns.cn ~]# ip addr

At this point:
keepalived handles heartbeat (node) failures,
but it does not yet handle failures of the Nginx service itself.

4. Extension: health checking for Nginx on the director (optional)
Idea:
Have Keepalived run an external script at a fixed interval; when Nginx has failed, the script stops Keepalived on this machine so the VIP fails over.
a. The script
[root@master ~]# cat /etc/keepalived/check_nginx_status.sh
#!/bin/bash
# If Nginx no longer answers on localhost, stop keepalived so the VIP moves.
/usr/bin/curl -I http://localhost &>/dev/null
if [ $? -ne 0 ];then
	/etc/init.d/keepalived stop
fi
[root@master ~]# chmod a+x /etc/keepalived/check_nginx_status.sh

b. Using the script from keepalived
! Configuration File for keepalived

global_defs {
   router_id director1
}

vrrp_script check_nginx {
   script "/etc/keepalived/check_nginx_status.sh"
   interval 5
}

vrrp_instance VI_1 {
    state MASTER
    interface eth0
    nopreempt
    virtual_router_id 90
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass qfedu
    }
    
    virtual_ipaddress {
        192.168.1.80
    }

    track_script {
        check_nginx
    }
}

Note: nginx must be started before keepalived; otherwise the check fails immediately and keepalived stops itself.

3. MySQL + Keepalived

Automatic failover with Keepalived + MySQL

Project environment:
VIP 192.168.122.100
mysql1 192.168.122.10
mysql2 192.168.122.20
 

Implementation steps:
I. Keepalived configuration on the master and the backup
Master configuration on 192.168.122.10:
[root@zdns.cn ~]# vim /etc/keepalived/keepalived.conf
=====================================================================
! Configuration File for keepalived

global_defs {
   router_id mysql1
}

vrrp_script check_run {
   script "/root/keepalived_check_mysql.sh"
   interval 5
}

vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 88
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass qfedu
    }

    track_script {
        check_run
    }

    virtual_ipaddress {
        192.168.122.100
    }
}

=====================================================================



Slave configuration on 192.168.122.20:
[root@zdns.cn ~]# vim /etc/keepalived/keepalived.conf
=====================================================================
! Configuration File for keepalived

global_defs {
   router_id mysql2
}

vrrp_script check_run {
   script "/root/keepalived_check_mysql.sh"
   interval 5
}

vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 88
    priority 50
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass qfedu
    }

    track_script {
        check_run
    }

    virtual_ipaddress {
        192.168.122.100
    }
}

1. Watch out for stray whitespace in the configuration file.
2. Check the log to confirm the script is actually being executed:
[root@xen2 ~]# tail -f /var/log/messages 
Jun 19 15:20:19 xen1 Keepalived_vrrp[6341]: Using LinkWatch kernel netlink reflector...
Jun 19 15:20:19 xen1 Keepalived_vrrp[6341]: VRRP sockpool: [ifindex(2), proto(112), fd(11,12)]
Jun 19 15:20:19 xen1 Keepalived_vrrp[6341]: VRRP_Script(check_run) succeeded

=====================================================================


II. The MySQL status-check script /root/keepalived_check_mysql.sh (the same script on both MySQL hosts)
Version 1: a simple check:
#!/bin/bash
# If MySQL does not respond, stop keepalived so the VIP fails over.
/usr/bin/mysql -uroot -p123 -e "show status" &>/dev/null
if [ $? -ne 0 ] ;then
	service keepalived stop
fi


Version 2: retry several times before failing over:
[root@zdns.cn ~]#  vim  /root/keepalived_check_mysql.sh
#!/bin/bash
# Check MySQL up to CHECK_TIME times; only stop keepalived (and fail the VIP
# over) if every attempt fails.
MYSQL=/usr/local/mysql/bin/mysql
MYSQL_HOST=localhost
MYSQL_USER=root
MYSQL_PASSWORD=qfedu
CHECK_TIME=3

# MYSQL_OK=1 means MySQL is working, MYSQL_OK=0 means MySQL is down
MYSQL_OK=1

check_mysql_health (){
    $MYSQL -h $MYSQL_HOST -u $MYSQL_USER -p${MYSQL_PASSWORD} -e "show status" &>/dev/null
    if [ $? -eq 0 ] ;then
        MYSQL_OK=1
    else
        MYSQL_OK=0
    fi
    return $MYSQL_OK
}

while [ $CHECK_TIME -ne 0 ]
do
    check_mysql_health
    if [ $MYSQL_OK -eq 1 ] ; then
        exit 0
    fi

    if [ $MYSQL_OK -eq 0 ] && [ $CHECK_TIME -eq 1 ];then
        /etc/init.d/keepalived stop
        exit 1
    fi
    let CHECK_TIME--
    sleep 1
done

Version 3: retry several times, stopping keepalived after the loop:
[root@zdns.cn ~]#  vim  /root/keepalived_check_mysql.sh
#!/bin/bash
# Same idea as version 2, but the failover action is taken after the loop:
# if no attempt succeeded, stop keepalived.
MYSQL=/usr/local/mysql/bin/mysql
MYSQL_HOST=localhost
MYSQL_USER=root
MYSQL_PASSWORD=qfedu
CHECK_TIME=3

# MYSQL_OK=1 means MySQL is working, MYSQL_OK=0 means MySQL is down
MYSQL_OK=1

check_mysql_health (){
    $MYSQL -h $MYSQL_HOST -u $MYSQL_USER -p${MYSQL_PASSWORD} -e "show status" &>/dev/null
    if [ $? -eq 0 ] ;then
        MYSQL_OK=1
    else
        MYSQL_OK=0
    fi
    return $MYSQL_OK
}

while [ $CHECK_TIME -ne 0 ]
do
    check_mysql_health
    if [ $MYSQL_OK -eq 1 ] ; then
        exit 0
    fi

    let CHECK_TIME--
    sleep 1
done

/etc/init.d/keepalived stop
exit 1
===================================================

[root@zdns.cn ~]# chmod 755 /root/keepalived_check_mysql.sh
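Before relying on keepalived to run it, the script can be executed once by hand (a quick sanity check, not in the original); with MySQL up it should report exit code 0 and do nothing else:

[root@zdns.cn ~]# /root/keepalived_check_mysql.sh; echo "exit code: $?"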

Start keepalived on both nodes:
[root@zdns.cn ~]# /etc/init.d/keepalived start
[root@zdns.cn ~]# /etc/init.d/keepalived start
[root@zdns.cn ~]# chkconfig --add keepalived
[root@zdns.cn ~]# chkconfig keepalived on
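To verify the failover end to end (a sketch, not part of the original; the mysqld service name is an assumption), stop MySQL on the current master and confirm that keepalived exits and the VIP 192.168.122.100 moves to the other node:

[root@mysql1 ~]# service mysqld stop
[root@mysql1 ~]# tail /var/log/messages              # the check script should stop keepalived here
[root@mysql2 ~]# ip addr | grep 192.168.122.100      # the VIP should now appear on mysql2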

4. Lvs_Director + Keepalived

LVS by itself cannot detect whether a back-end real server has gone down, so we combine keepalived with LVS in DR mode: keepalived monitors the real servers and stops forwarding requests to any real server that has failed. Because LVS support is built into keepalived, only keepalived needs to be installed on the director; there is no need to drive the rules with ipvsadm by hand (the ipvsadm package is only used below to inspect them) or to write an lvs_dr.sh script on the director; the keepalived configuration alone is enough.

To save time, this walkthrough only sets up the master director; no backup director is configured.

Three servers, A, B and C:

1. A: load balancer

(the director / dispatcher)

Internal NIC: 192.168.31.128, gateway unchanged (192.168.31.2)

External NIC: 192.168.229.128, not used here and can be ignored

1. Back up the previous keepalived configuration (from the nginx HA setup)
[root@a.zdns.cn  ~]# yum -y install ipvsadm net-tools keepalived
[root@a.zdns.cn  ~]# mv /etc/keepalived/keepalived.conf /etc/keepalived/keepalived.conf-bak
2. Flush the ipvsadm rules
[root@a.zdns.cn  ~]# ipvsadm -C
3. Create the keepalived configuration
[root@a.zdns.cn  ~]# vim /etc/keepalived/keepalived.conf
# Add the following content:
vrrp_instance VI_1 {
    state MASTER                    # role is master
    interface eth0                  # interface the VIP is bound to
    virtual_router_id 51            # virtual router ID, must match the backup
    priority 100                    # priority 100; a backup would use a lower value
    advert_int 1
    authentication {
        auth_type PASS              # password authentication
        auth_pass 123456            # the password is 123456
    }
    virtual_ipaddress {
        192.168.31.200              # the VIP
    }
}
virtual_server 192.168.31.200 80 {  # VIP and port to serve
    delay_loop 10                   # poll real-server state every 10 seconds
    lb_algo wlc                     # LVS scheduling algorithm (weighted least-connection)
    lb_kind DR                      # LVS forwarding mode (direct routing)
    persistence_timeout 60          # keep a client on the same real server for 60 seconds
    protocol TCP                    # use TCP to check real-server state
    real_server 192.168.31.129 80 { # first real server
        weight 100                  # weight
        TCP_CHECK {                 # TCP health check
            connect_timeout 10      # connection timeout of 10 seconds
            nb_get_retry 3
            delay_before_retry 3
            connect_port 80         # check port 80
        }
    }
    real_server 192.168.31.130 80 { # second real server, same as above
        weight 100
        TCP_CHECK {
            connect_timeout 10
            nb_get_retry 3
            delay_before_retry 3
            connect_port 80
        }
    }
}

Example (same structure with a different subnet and the rr algorithm):
vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 123456
    }
    virtual_ipaddress {
        192.168.152.200
    }
}
virtual_server 192.168.152.200 80 {
    delay_loop 10
    lb_algo rr
    lb_kind DR
    persistence_timeout 60
    protocol TCP
    real_server 192.168.152.132 80 {
        weight 110
        TCP_CHECK {
            connect_timeout 10
            nb_get_retry 3
            delay_before_retry 3
            connect_port 80
        }
    }
    real_server 192.168.152.133 80 {
        weight 100
        TCP_CHECK {
            connect_timeout 10
            nb_get_retry 3
            delay_before_retry 3
            connect_port 80
        }
    }
}
4. Check the ipvsadm forwarding rules
[root@a.zdns.cn  ~]# ipvsadm -ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
5. Start keepalived
[root@a.zdns.cn ~]# systemctl start keepalived
6. Check the ipvsadm rules again; the forwarding rules should now be present
[root@a.zdns.cn ~]# ipvsadm -ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 192.168.31.200:80 wlc persistent 60
-> 192.168.31.129:80 Route 100 0 0
-> 192.168.31.130:80 Route 100 0 0

2. B: real server

(web server) Internal NIC: 192.168.31.129, gateway set back to 192.168.31.2

Install and start nginx, write "real server 1" into its default page, disable SELinux, and flush the firewall rules (a sketch of these commands follows).
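A sketch of that preparation (the default-page path assumes the stock nginx package; real server C below is prepared the same way with "real server 2"):

[root@b.zdns.cn ~]# yum -y install nginx
[root@b.zdns.cn ~]# systemctl start nginx
[root@b.zdns.cn ~]# echo "real server 1" > /usr/share/nginx/html/index.html
[root@b.zdns.cn ~]# setenforce 0      # disable SELinux until next boot
[root@b.zdns.cn ~]# iptables -F       # flush firewall rules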

1. Create the forwarding script
[root@b.zdns.cn ~]# yum -y install net-tools

[root@b.zdns.cn ~]# vim /usr/local/sbin/lvs_rs.sh

Add the following content:
#!/bin/bash
vip=192.168.31.200
# Bind the VIP to lo so that the real server can reply to clients directly
ifdown lo
ifup lo
# Bind the VIP to the virtual interface lo:0
ifconfig lo:0 $vip broadcast $vip netmask 255.255.255.255 up
# Add a host route for the VIP via lo:0
route add -host $vip lo:0
# Tune the ARP kernel parameters so the real server does not answer ARP
# requests for the VIP; clients keep resolving the VIP to the director's MAC
echo "1" >/proc/sys/net/ipv4/conf/lo/arp_ignore
echo "2" >/proc/sys/net/ipv4/conf/lo/arp_announce
echo "1" >/proc/sys/net/ipv4/conf/all/arp_ignore
echo "2" >/proc/sys/net/ipv4/conf/all/arp_announce


Example:
#!/bin/bash
vip=192.168.152.200
ifdown lo
ifup lo
ifconfig lo:0 $vip broadcast $vip netmask 255.255.255.255 up
route add -host $vip lo:0
echo "1" >/proc/sys/net/ipv4/conf/lo/arp_ignore
echo "2" >/proc/sys/net/ipv4/conf/lo/arp_announce
echo "1" >/proc/sys/net/ipv4/conf/all/arp_ignore
echo "2" >/proc/sys/net/ipv4/conf/all/arp_announce
2. Make the script executable
[root@b.zdns.cn ~]# chmod 755 /usr/local/sbin/lvs_rs.sh
3. Run the script
[root@b.zdns.cn ~]# sh /usr/local/sbin/lvs_rs.sh
4. Check the VIP in the routing table
[root@b.zdns.cn ~]# route -n
5. Check the VIP on interface lo
[root@b.zdns.cn ~]# ip addr

3. C: real server

(web server) Internal NIC: 192.168.31.130, gateway set back to 192.168.31.2

Install and start nginx, write "real server 2" into its default page, disable SELinux, and flush the firewall rules (same preparation as on B).

1. Create the forwarding script
[root@c.zdns.cn ~]# yum -y install net-tools

[root@c.zdns.cn ~]# vim /usr/local/sbin/lvs_rs.sh

Add the following content:
#!/bin/bash
vip=192.168.31.200
# Bind the VIP to lo so that the real server can reply to clients directly
ifdown lo
ifup lo
# Bind the VIP to the virtual interface lo:0
ifconfig lo:0 $vip broadcast $vip netmask 255.255.255.255 up
# Add a host route for the VIP via lo:0
route add -host $vip lo:0
# Tune the ARP kernel parameters so the real server does not answer ARP
# requests for the VIP; clients keep resolving the VIP to the director's MAC
echo "1" >/proc/sys/net/ipv4/conf/lo/arp_ignore
echo "2" >/proc/sys/net/ipv4/conf/lo/arp_announce
echo "1" >/proc/sys/net/ipv4/conf/all/arp_ignore
echo "2" >/proc/sys/net/ipv4/conf/all/arp_announce


Example:
#!/bin/bash
vip=192.168.152.200
ifdown lo
ifup lo
ifconfig lo:0 $vip broadcast $vip netmask 255.255.255.255 up
route add -host $vip lo:0
echo "1" >/proc/sys/net/ipv4/conf/lo/arp_ignore
echo "2" >/proc/sys/net/ipv4/conf/lo/arp_announce
echo "1" >/proc/sys/net/ipv4/conf/all/arp_ignore
echo "2" >/proc/sys/net/ipv4/conf/all/arp_announce
2. Make the script executable
[root@c.zdns.cn ~]# chmod 755 /usr/local/sbin/lvs_rs.sh
3. Run the script
[root@c.zdns.cn ~]# sh /usr/local/sbin/lvs_rs.sh
4. Check the VIP in the routing table
[root@c.zdns.cn ~]# route -n
5. Check the VIP on interface lo
[root@c.zdns.cn ~]# ip addr
6. Testing

Test 1:

Open 192.168.31.200 (the VIP, virtual IP) in a browser and refresh several times to watch the responses switch between the real servers.

Because of local caching in the browser (and the persistence_timeout that was configured), every refresh may keep landing on real server 2. Test from the director with curl 192.168.31.200 instead; with the rr scheduling algorithm the alternation between real servers is much more obvious.

Test 2:

Stop nginx on one of the real servers and check in the browser that traffic fails over to the other real server.

Test 3:

Start the stopped nginx again and check in the browser that the real server is brought back into rotation.

5. Keepalived split brain

Split brain:
When a Keepalived BACKUP node stops receiving the MASTER's advertisements, it promotes itself to master. If the problem is only in the communication path between the two nodes (the multicast announcements cannot get through) while both nodes are actually working normally, both nodes become master and both claim the virtual IP, with unpredictable consequences. This is split brain.
Ways to mitigate it:
1. Add more detection paths, for example a redundant heartbeat link (health checks over two NICs), pinging the peer, and so on, to reduce the chance of split brain. (This treats the symptom rather than the cause; it only raises the probability of detecting the problem.)
2. Add an arbitration mechanism: if the two sides cannot trust each other, rely on a third party, for example a shared disk lock or pinging the gateway (see the sketch below). (The right choice depends on the specific setup.)
3. Fence the node: forcibly shut down the old master, then check the firewalls on the machines and the network path between them.
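A concrete form of the gateway-ping arbitration in point 2 (a sketch, not from the original; the gateway address and weight are assumptions) is a vrrp_script that lowers the node's priority while the shared gateway is unreachable, so an isolated node gives up the VIP:

vrrp_script chk_gateway {
    script "/usr/bin/ping -c 1 -W 1 192.168.122.1"   # assumed gateway address
    interval 2
    weight -30       # lower the priority while the gateway is unreachable
    fall 3           # require 3 consecutive failures
    rise 2           # require 2 consecutive successes to recover
}

# referenced from the vrrp_instance block:
#     track_script {
#         chk_gateway
#     }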