Simple Monitoring and Alerting for Redis and Sentinel
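A while ago I deployed a Redis cluster (master-replica replication + read/write splitting + Sentinel) for a third-party customer and needed to set up some simple alerting (it is their server, after all, so I did not feel comfortable installing zabbix on it).

I. Configure the Linux server to send mail through a third-party SMTP server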
1. Make sure the postfix service is running
# systemctl status postfix
2. Install mailx
# yum install -y mailx
3. Configure the SMTP server
Edit /etc/mail.rc and add the following lines to the file:
# vim /etc/mail.rc
set from=user_sunli@sina.com
set smtp=smtp.sina.com
set smtp-port=465
set smtp-auth-user=user_sunli@sina.com
set smtp-auth-password=xxxxxxxxxxxx
set smtp-auth=login
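Port 465 normally expects TLS from the first byte. As far as I know, Heirloom mailx (the mailx that yum installs on CentOS 7) takes the port as part of the smtp value rather than from a separate variable, so if the settings above fail to connect, a variant like the following may be worth trying. This is only a sketch: the function name append_smtps_settings is made up for illustration, the nss-config-dir path is an assumption about a typical CentOS host, and ssl-verify=ignore turns off certificate verification, which is convenient for a quick test but not something to leave on blindly.

```shell
#!/bin/bash
# append_smtps_settings FILE: append implicit-TLS SMTP settings to FILE.
# Assumes Heirloom mailx. The account values are the ones used in this post;
# the last two settings are assumptions, adjust them to the target host.
append_smtps_settings() {
    cat >>"$1" <<'EOF'
set from=user_sunli@sina.com
set smtp=smtps://smtp.sina.com:465
set smtp-auth=login
set smtp-auth-user=user_sunli@sina.com
set smtp-auth-password=xxxxxxxxxxxx
set ssl-verify=ignore
set nss-config-dir=/etc/pki/nssdb
EOF
}
```

Calling append_smtps_settings /etc/mail.rc would add the block to the system-wide configuration; pointing it at a scratch file first makes it easy to review what will be written.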
4. Test
# echo "message body" | mail -s "message subject" <public email address>
# echo "hello" | mail -s "hehehe" sunli@bdszh.vip
For details, see: https://www.cnblogs.com/user-sunli/p/14221617.html

II. Monitoring script and scheduled task
Install nc
# yum -y install nc
Write the script
# vim /data/scripts/redis_mail.sh
#!/bin/bash
local_ip=$(hostname -I | awk '{print $1}')

# Local node: restart and alert if the port is no longer listening
netstat -tnlp | grep 56379
[ $? -ne 0 ] && systemctl restart redis.service && echo "Please check $local_ip redis" | mail -s "redis is down" sunli@bdszh.vip
netstat -tnlp | grep 46379
[ $? -ne 0 ] && systemctl restart sentinel.service && echo "Please check $local_ip sentinel" | mail -s "sentinel is down" sunli@bdszh.vip

# Remote node 10.0.36.132: probe with nc, restart through ansible
nc -zvw3 10.0.36.132 56379
[ $? -ne 0 ] && ansible 10.0.36.132 -m systemd -a "name=redis state=restarted" && echo "Please check 10.0.36.132 redis" | mail -s "redis is down" sunli@bdszh.vip
nc -zvw3 10.0.36.132 46379
[ $? -ne 0 ] && ansible 10.0.36.132 -m systemd -a "name=sentinel state=restarted" && echo "Please check 10.0.36.132 sentinel" | mail -s "sentinel is down" sunli@bdszh.vip

# Remote node 10.0.36.134
nc -zvw3 10.0.36.134 56379
[ $? -ne 0 ] && ansible 10.0.36.134 -m systemd -a "name=redis state=restarted" && echo "Please check 10.0.36.134 redis" | mail -s "redis is down" sunli@bdszh.vip
nc -zvw3 10.0.36.134 46379
[ $? -ne 0 ] && ansible 10.0.36.134 -m systemd -a "name=sentinel state=restarted" && echo "Please check 10.0.36.134 sentinel" | mail -s "sentinel is down" sunli@bdszh.vip
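The six checks in the script differ only in host, port and service name, so they could also be folded into a couple of functions. The following is a rough sketch of that idea, not a drop-in replacement: the function and variable names (alert, check_local, check_remote, run_checks, ALERT_TO and so on) are made up here, and it deliberately behaves a little differently from the script above in that the mail goes out even when the restart command itself fails, and probe output is discarded so cron has nothing to mail locally.

```shell
#!/bin/bash
# Sketch only: names are illustrative; addresses and ports are the ones from this post.
ALERT_TO="sunli@bdszh.vip"
REDIS_PORT=56379
SENTINEL_PORT=46379
REMOTE_HOSTS="10.0.36.132 10.0.36.134"

# alert UNIT HOST: mail a one-line notice with the same wording as the original script.
alert() {
    echo "Please check $2 $1" | mail -s "$1 is down" "$ALERT_TO"
}

# check_local PORT UNIT: restart the local unit if nothing listens on PORT.
check_local() {
    if ! netstat -tnl | grep -q ":$1 "; then
        systemctl restart "$2.service"
        alert "$2" "$(hostname -I | awk '{print $1}')"
    fi
}

# check_remote HOST PORT UNIT: restart the unit through ansible if PORT is unreachable.
check_remote() {
    if ! nc -zw3 "$1" "$2" >/dev/null 2>&1; then
        ansible "$1" -m systemd -a "name=$3 state=restarted"
        alert "$3" "$1"
    fi
}

# run_checks: one pass over the local node and every remote node.
run_checks() {
    check_local "$REDIS_PORT" redis
    check_local "$SENTINEL_PORT" sentinel
    for host in $REMOTE_HOSTS; do
        check_remote "$host" "$REDIS_PORT" redis
        check_remote "$host" "$SENTINEL_PORT" sentinel
    done
}
```

To use it as the cron script, add a final line that calls run_checks. Adding another node then only means extending REMOTE_HOSTS.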
Scheduled task
On Linux, the smallest scheduling unit of crontab is one minute, so it cannot run a job every few seconds directly and a workaround is needed. The idea: to run the script every 5 seconds, create twelve entries that all fire every minute, each one sleeping 5 seconds longer than the previous one (0, 5, 10, ... 55 seconds) before it runs the script.
# crontab -e
*/1 * * * * /data/scripts/redis_mail.sh
*/1 * * * * sleep 5; /data/scripts/redis_mail.sh
*/1 * * * * sleep 10; /data/scripts/redis_mail.sh
*/1 * * * * sleep 15; /data/scripts/redis_mail.sh
*/1 * * * * sleep 20; /data/scripts/redis_mail.sh
*/1 * * * * sleep 25; /data/scripts/redis_mail.sh
*/1 * * * * sleep 30; /data/scripts/redis_mail.sh
*/1 * * * * sleep 35; /data/scripts/redis_mail.sh
*/1 * * * * sleep 40; /data/scripts/redis_mail.sh
*/1 * * * * sleep 45; /data/scripts/redis_mail.sh
*/1 * * * * sleep 50; /data/scripts/redis_mail.sh
*/1 * * * * sleep 55; /data/scripts/redis_mail.sh
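The same "sleep in between" idea can also be written as a loop, so that crontab needs only one entry. Below is a minimal sketch of such a wrapper; run_every is a hypothetical helper name, not something that ships with cron.

```shell
#!/bin/bash
# run_every INTERVAL COUNT CMD...: run CMD COUNT times, INTERVAL seconds apart.
run_every() {
    local interval="$1"
    local count="$2"
    shift 2
    local i=0
    while [ "$i" -lt "$count" ]; do
        "$@"
        i=$((i + 1))
        # No sleep after the last run, so the loop can finish before the
        # next cron tick (as long as each run of CMD is short).
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
    return 0
}
```

Saved together with a final line such as `run_every 5 12 /data/scripts/redis_mail.sh` in a wrapper script (say /data/scripts/redis_mail_loop.sh, a name chosen only for this example), a single entry `* * * * * /data/scripts/redis_mail_loop.sh` would replace the twelve lines. One difference to keep in mind: the loop waits for each run to finish before sleeping, so slow checks (for example nc timing out against a dead host) stretch the cycle beyond one minute and consecutive wrappers can overlap; wrapping the cron command in `flock -n` with a lock file is one common way to guard against that.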
III. Simulate a failure
Stop redis or sentinel manually.
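To see whether the monitoring actually brings the service back, one option is to stop it and then poll the port for a while. The helper below is a small sketch for that; wait_until is a hypothetical name, and the commands in the trailing comment assume Redis also listens on 127.0.0.1 (otherwise use the node's own address).

```shell
#!/bin/bash
# wait_until TRIES INTERVAL CMD...: retry CMD until it succeeds,
# giving up after TRIES attempts spaced INTERVAL seconds apart.
wait_until() {
    local tries="$1"
    local interval="$2"
    shift 2
    local n=0
    while [ "$n" -lt "$tries" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        n=$((n + 1))
        sleep "$interval"
    done
    return 1
}

# Example (run by hand on a test node):
#   systemctl stop redis.service
#   wait_until 30 1 nc -z 127.0.0.1 56379 && echo "redis is listening again"
```

If the port comes back within the polling window and the alert mail arrives, both the restart and the notification path are working.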