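There are many high-availability architectures for Redis today, such as keepalived+redis, redis cluster, twemproxy, and codis, each with its own strengths and weaknesses. This post sets those aside and focuses on the redis sentinel high-availability architecture.
Sentinel's main functions are:
- Continuously monitoring whether redis is running as expected;
- Notifying another process (for example, its clients) when a redis node runs into trouble;
- Automatic failover. When a master node becomes unavailable, Sentinel promotes one of the master's slaves (choosing among them if there is more than one) to be the new master, and the other slaves change the master address they follow to the address of the promoted slave.
For more detailed configuration and background, I recommend reading the following articles. I won't repeat them here and will go straight to the setup: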
http://segmentfault.com/a/1190000002680804
http://segmentfault.com/a/1190000002685515
The redis sentinel architecture is shown in the figure below:
Of course, Redis Sentinel recommends using 3 or more nodes. For the reasons, see the articles linked above.
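A quick way to see why 3 or more nodes are recommended: a failover has to be authorized by a majority of the Sentinel processes, so what matters is how many Sentinels can be lost while a majority is still reachable. The sketch below only does that arithmetic; the function names are made up for illustration and are not part of Redis.

```python
def failover_majority(sentinel_count):
    """Smallest number of Sentinels that forms a strict majority."""
    return sentinel_count // 2 + 1


def tolerated_failures(sentinel_count):
    """How many Sentinels can be lost while a majority is still reachable."""
    return sentinel_count - failover_majority(sentinel_count)


for n in (1, 2, 3, 5):
    print("%d sentinels: majority=%d, tolerated failures=%d"
          % (n, failover_majority(n), tolerated_failures(n)))
```

With 2 Sentinels the majority is 2, so losing either one blocks failover. With the 5 Sentinels used in this post, 2 can be lost.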
Environment:
Redis Sentinel, 5 servers:
10.36.30.203
10.36.30.204
10.37.124.202
10.37.124.203
10.37.124.204
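Don't think of this as wasteful: it makes the monitoring of redis safer and more efficient, and Redis Sentinel can be reused, that is, it can monitor multiple Redis instances, so the servers are not wasted.
Redis servers, 2 machines, 1 master and 1 slave: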
10.69.25.173 master
10.69.30.170 slave
The configuration file on all 5 Sentinels is as follows:
port 26379
dir "/data/redis/sentinel/26379"
daemonize yes
logfile "/data/redis/sentinel/26379/sentinel.log"
sentinel monitor master-6379 10.69.25.173 6379 <quorum>
sentinel down-after-milliseconds master-6379 <milliseconds>
sentinel parallel-syncs master-6379 <count>
sentinel failover-timeout master-6379 <milliseconds>
sentinel client-reconfig-script master-6379 /sh/redis/notify.py
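Here, sentinel client-reconfig-script master-6379 /sh/redis/notify.py sends an alert email after a master/slave switch. For the meaning of the other parameters, see the articles linked above. Create the relevant directories yourself.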
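Since all 5 Sentinels share the same file, it can be handy to generate it from one place. Here is a minimal sketch of that idea. The helper name render_sentinel_conf is hypothetical, and the numeric values (quorum 3, 5000 ms, 1, 60000 ms) are assumptions picked for illustration, not the values used in this setup.

```python
def render_sentinel_conf(master_name, master_ip, master_port, quorum,
                         sentinel_port=26379,
                         down_after_ms=5000,          # assumed value
                         parallel_syncs=1,            # assumed value
                         failover_timeout_ms=60000,   # assumed value
                         notify_script="/sh/redis/notify.py"):
    """Build the text of a sentinel.conf for one monitored master."""
    base_dir = "/data/redis/sentinel/%d" % sentinel_port
    lines = [
        "port %d" % sentinel_port,
        'dir "%s"' % base_dir,
        "daemonize yes",
        'logfile "%s/sentinel.log"' % base_dir,
        "sentinel monitor %s %s %d %d"
        % (master_name, master_ip, master_port, quorum),
        "sentinel down-after-milliseconds %s %d" % (master_name, down_after_ms),
        "sentinel parallel-syncs %s %d" % (master_name, parallel_syncs),
        "sentinel failover-timeout %s %d" % (master_name, failover_timeout_ms),
        "sentinel client-reconfig-script %s %s" % (master_name, notify_script),
    ]
    return "\n".join(lines) + "\n"


# quorum=3 is an assumption for this example
conf = render_sentinel_conf("master-6379", "10.69.25.173", 6379, quorum=3)
print(conf)
```

Writing the returned string to sentinel.conf on each server keeps the 5 copies identical.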
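The notify.py script is shown below. It must exist on all 5 servers, because you cannot know in advance which node will be elected leader (I have not seen anyone online cover sending alert emails on failover):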
#!/usr/bin/python
# coding: utf8
import sys
import time
import smtplib
import logging
from email.mime.text import MIMEText
from email.header import Header

# Recipients of the failover alert
alarm_mail = ['xxxxxx@163.com']

def main():
    failover_time = time.strftime("%Y-%m-%d %H:%M:%S")
    # Log everything to a file and echo INFO and above to the console
    logging.basicConfig(level=logging.DEBUG,
                        format='%(asctime)s %(filename)s[line:%(lineno)d] %(levelname)s %(message)s',
                        datefmt='%Y-%m-%d %H:%M:%S',
                        filename='/sh/redis/failover.log',
                        filemode='a')
    console = logging.StreamHandler()
    console.setLevel(logging.INFO)
    formatter = logging.Formatter('%(name)-12s: %(levelname)-8s %(message)s')
    console.setFormatter(formatter)
    logging.getLogger('').addHandler(console)

    # SMTP settings
    mail_host = 'xxxxx'
    mail_port = 25
    mail_user = 'xxxxxxx'
    mail_pass = 'xxxxxxxx'
    mail_send_from = 'xxxxxxx'

    def send_mail(to_list, sub, content):
        me = mail_send_from
        msg = MIMEText(content, _subtype='html', _charset='utf-8')
        msg['Subject'] = Header(sub, 'utf-8')
        msg['From'] = Header(me, 'utf-8')
        msg['To'] = ", ".join(to_list)
        try:
            smtp = smtplib.SMTP()
            smtp.connect(mail_host, mail_port)
            smtp.login(mail_user, mail_pass)
            smtp.sendmail(me, to_list, msg.as_string())
            smtp.close()
            return True
        except Exception as error:
            logging.error("failed to send mail: %s" % (error))
            return False

    # Sentinel calls the script with:
    # <master-name> <role> <state> <from-ip> <from-port> <to-ip> <to-port>
    try:
        master_name = sys.argv[1]
        role = sys.argv[2]
        from_ip = sys.argv[4]
        from_port = sys.argv[5]
        to_ip = sys.argv[6]
        to_port = sys.argv[7]
    except Exception as error:
        logging.error('failed to read arguments from Sentinel: %s' % (error))
        sys.exit(1)

    sub = 'redis %s failover' % (master_name)
    notify_message = "%s %s failover finished. sentinel found redis master %s:%s down and failed over to slave %s:%s" % (failover_time, master_name, from_ip, from_port, to_ip, to_port)

    # Only the leader Sentinel sends the alert, so one failover produces one email
    if role == 'leader':
        logging.info(notify_message)
        send_mail(alarm_mail, sub, notify_message)

if __name__ == "__main__":
    main()
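The argument handling in the script above can also be pulled into a small function so it can be checked without triggering a real failover. This is only a sketch: ReconfigEvent, parse_reconfig_args and should_alert are names invented here, and the sample argument list (including the state value "start") is illustrative rather than captured from a real Sentinel call.

```python
from collections import namedtuple

# Field order follows the arguments Sentinel passes to client-reconfig-script:
# <master-name> <role> <state> <from-ip> <from-port> <to-ip> <to-port>
ReconfigEvent = namedtuple(
    "ReconfigEvent",
    "master_name role state from_ip from_port to_ip to_port")


def parse_reconfig_args(argv):
    """Turn sys.argv-style input into a ReconfigEvent, or raise ValueError."""
    if len(argv) != 8:
        raise ValueError("expected 7 arguments, got %d" % (len(argv) - 1))
    name, role, state, from_ip, from_port, to_ip, to_port = argv[1:8]
    return ReconfigEvent(name, role, state,
                         from_ip, int(from_port), to_ip, int(to_port))


def should_alert(event):
    """Only the leader sends mail, so observers stay silent."""
    return event.role == "leader"


# Illustrative arguments, not taken from a real run
example = ["notify.py", "master-6379", "leader", "start",
           "10.69.25.173", "6379", "10.69.30.170", "6379"]
event = parse_reconfig_args(example)
print(event.to_ip, should_alert(event))
```

Sentinels that are not the leader run the script with the role observer, which is why the role check matters when the script is installed on all 5 servers.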
10.69.25.173 master
10.69.30.170 slave
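Install redis yourself and set up the replication relationship.
Now start Sentinel on each of the 5 Sentinel servers. There are 2 ways to start it. See the articles above for which two.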
redis-sentinel sentinel.conf
After startup, pick any server and check the log. It prints the following:
[] Dec ::47.161 # Sentinel runid is f3086fc39145cb3d832785899699050d2c7f3b08
[] Dec ::47.161 # +monitor master master-6379 10.69.25.173 6379 quorum
[] Dec ::47.183 * +slave slave 10.69.30.170:6379 10.69.30.170 6379 @ master-6379 10.69.25.173 6379
Here +slave means that a slave has been found.
Now look at the log on the other sentinel servers:
[] Dec ::37.250 # Sentinel runid is 812f9f8b860dcc73d4b587e3bdf85df13808a3cd
[] Dec ::37.250 # +monitor master master-6379 10.69.25.173 6379 quorum
[] Dec ::38.252 * +slave slave 10.69.30.170:6379 10.69.30.170 6379 @ master-6379 10.69.25.173 6379
[] Dec ::38.304 * +sentinel sentinel 10.36.30.204:26379 10.36.30.204 26379 @ master-6379 10.69.25.173 6379
[] Dec ::38.388 * +sentinel sentinel 10.37.124.202:26379 10.37.124.202 26379 @ master-6379 10.69.25.173 6379
[] Dec ::38.461 * +sentinel sentinel 10.37.124.203:26379 10.37.124.203 26379 @ master-6379 10.69.25.173 6379
[] Dec ::39.423 * +sentinel sentinel 10.37.124.204:26379 10.37.124.204 26379 @ master-6379 10.69.25.173 6379
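+sentinel means that another sentinel server has been discovered. At this point the whole cluster is working.
First, connect to sentinel to check which server is currently the master (any sentinel will do):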
redis-cli -p 26379
127.0.0.1:26379> info Sentinel
# Sentinel
sentinel_masters:
sentinel_tilt:
sentinel_running_scripts:
sentinel_scripts_queue_length:
master0:name=master-6379,status=ok,address=10.69.25.173:6379,slaves=,sentinels=
127.0.0.1:26379>
You can see that the current master is 10.69.25.173:6379. Now kill the redis process on that server and check whether a switch takes place:
pkill -9 redis
Check again, and the master is now the former slave, as the output below shows. An alert email is also sent.
127.0.0.1:26379> info Sentinel
# Sentinel
sentinel_masters:
sentinel_tilt:
sentinel_running_scripts:
sentinel_scripts_queue_length:
master0:name=master-6379,status=ok,address=10.69.30.170:6379,slaves=,sentinels=
127.0.0.1:26379>
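If you want to read the current master out of this output from a script, the master0 line is just comma-separated key=value pairs. Below is a minimal sketch of parsing it. The function name parse_master_line is made up, and the slaves and sentinels counts in the sample line are assumed values for illustration.

```python
def parse_master_line(line):
    """Split a 'master0:name=...,address=ip:port,...' line into its fields."""
    key, _, payload = line.partition(":")
    fields = dict(item.split("=", 1) for item in payload.split(","))
    host, _, port = fields["address"].rpartition(":")
    fields["host"] = host
    fields["port"] = int(port)
    return key, fields


# slaves=1 and sentinels=5 are assumed here for the sake of the example
sample = ("master0:name=master-6379,status=ok,"
          "address=10.69.30.170:6379,slaves=1,sentinels=5")
key, fields = parse_master_line(sample)
print("%s -> %s:%d (%s)" % (key, fields["host"], fields["port"], fields["status"]))
```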
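Likewise, if the redis that was just killed is started again, it is reconfigured as a slave of 10.69.30.170. The sentinel log shows the switch and the later conversion to slave: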
[] Dec ::48.921 # +new-epoch
[] Dec ::48.933 # +vote-for-leader 92517289efcb4ae695eff3e064fde7f4e0e43a1f
[] Dec ::48.955 # +sdown master master-6379 10.69.25.173 6379
[] Dec ::48.955 # +odown master master-6379 10.69.25.173 6379 #quorum /
[] Dec ::48.955 # Next failover delay: I will not start a failover before Sat Dec ::
[] Dec ::50.067 # +config-update-from sentinel 10.37.124.203:26379 10.37.124.203 26379 @ master-6379 10.69.25.173 6379
[] Dec ::50.067 # +switch-master master-6379 10.69.25.173 6379 10.69.30.170 6379
[] Dec ::50.067 * +slave slave 10.69.25.173:6379 10.69.25.173 6379 @ master-6379 10.69.30.170 6379
[] Dec ::05.109 # +sdown slave 10.69.25.173:6379 10.69.25.173 6379 @ master-6379 10.69.30.170 6379
[] Dec ::19.241 # -sdown slave 10.69.25.173:6379 10.69.25.173 6379 @ master-6379 10.69.30.170 6379
[] Dec ::29.219 * +convert-to-slave slave 10.69.25.173:6379 10.69.25.173 6379 @ master-6379 10.69.30.170 6379
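The line that matters most in this log is +switch-master, which carries the master name, the old address and the new address. As a rough sketch, a script could replay log lines and keep track of the current master like this. The function names are invented, and the sample lines are trimmed, illustrative versions of the entries above.

```python
def parse_switch_master(line):
    """Return (name, old_addr, new_addr) for a +switch-master line, else None."""
    marker = "+switch-master "
    start = line.find(marker)
    if start == -1:
        return None
    fields = line[start + len(marker):].split()
    if len(fields) != 5:
        return None
    name, old_ip, old_port, new_ip, new_port = fields
    return name, (old_ip, int(old_port)), (new_ip, int(new_port))


def track_master(lines, initial):
    """Replay log lines and return the master address after the last switch."""
    current = initial
    for line in lines:
        event = parse_switch_master(line)
        if event is not None:
            current = event[2]
    return current


# Illustrative lines with the timestamp prefix left out
log = [
    "# +sdown master master-6379 10.69.25.173 6379",
    "# +switch-master master-6379 10.69.25.173 6379 10.69.30.170 6379",
]
master = track_master(log, ("10.69.25.173", 6379))
print("current master: %s:%d" % master)
```

Sentinel also publishes these events over Pub/Sub on a channel with the same name as the event, so the same parsing applies to a subscriber that listens on +switch-master.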
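So how does a client learn that a master/slave switch has happened? With Java, the jedis client makes this convenient. With PHP or Python, we can do the detection ourselves. Another option is to use DNS and update the DNS record on a switch.
Here I wrote a simple daemon in Python (I don't know PHP, alas).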
#!/usr/bin/python
import binascii
import os
import redis

sentinel_server = ['10.36.30.203:26379', '10.36.30.204:26379', '10.37.124.202:26379', '10.37.124.203:26379', '10.37.124.204:26379']
master_host = None
master_port = None

def queue(host, port):
    # Push a random 32-character hex string onto the task queue
    task_id = binascii.hexlify(os.urandom(16)).decode('ascii')
    pool = redis.ConnectionPool(host=host, port=port, db=0)
    r = redis.Redis(connection_pool=pool)
    r.lpush('low_task_queue', task_id)

def get_sentinel():
    # Ask each Sentinel in turn and stop at the first one that answers
    global master_host
    global master_port
    for server in sentinel_server:
        host, port = server.split(':')
        try:
            r = redis.Redis(host=host, port=int(port))
            address = r.info('sentinel')['master0']['address'].split(':')
            master_host = address[0]
            master_port = int(address[1])
        except Exception as error:
            print('connect to sentinel error: %s' % (error))
        else:
            break

if __name__ == "__main__":
    get_sentinel()
    while True:
        try:
            queue(master_host, master_port)
        except Exception as error:
            # The master is unreachable: ask Sentinel for the current master, then retry
            print('connect to redis error: %s' % (error))
            get_sentinel()
            continue
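The lookup loop in the daemon above can be tested without a live Sentinel if the network call is passed in as a function. The sketch below shows that variation. find_master and fake_query are names made up here, and fake_query only simulates one unreachable Sentinel.

```python
def find_master(sentinels, query_master):
    """Ask each Sentinel in turn and return the first (host, port) answer."""
    errors = []
    for address in sentinels:
        host, _, port = address.rpartition(":")
        try:
            return query_master(host, int(port))
        except Exception as error:
            # Remember why this Sentinel failed and move on to the next one
            errors.append("%s: %s" % (address, error))
    raise RuntimeError("no Sentinel reachable: " + "; ".join(errors))


def fake_query(host, port):
    """Stand-in for a real Sentinel query: the first Sentinel is down."""
    if host == "10.36.30.203":
        raise IOError("connection refused")
    return ("10.69.30.170", 6379)


master = find_master(["10.36.30.203:26379", "10.36.30.204:26379"], fake_query)
print("master is %s:%d" % master)
```

In real use, query_master would wrap the r.info('sentinel') call from the daemon. redis-py also ships a redis.sentinel.Sentinel class whose discover_master() and master_for() methods cover the same ground, which may be simpler than maintaining this loop by hand.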
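If DNS is introduced, the architecture can look like the figure below:
That covers the basic testing. Further testing is left to you.
Summary:
Redis Sentinel is a fairly reliable way to achieve high availability, and I plan to use it in production later. Note that 3 or more Redis Sentinel nodes are recommended. It is more reliable than keepalived+redis, and keepalived+redis also cannot manage multiple instances, which is rather inconvenient.
References: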
http://segmentfault.com/a/1190000002680804
http://segmentfault.com/a/1190000002685515