Simple Monitoring and Alerting for Redis and Sentinel


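A while ago I deployed a Redis cluster (master-replica replication + read/write splitting + Sentinel) for a third-party customer and needed to set up a simple alert (after all, it would be awkward to install zabbix on someone else's servers).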

I. Configure the Linux server to send mail through a third-party SMTP server

1. Make sure the postfix service is running

# systemctl status postfix

2. Install mailx

# yum install -y mailx

3. Configure the SMTP server

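    Edit /etc/mail.rc and add the following lines:
# vim /etc/mail.rc
set from=user_sunli@sina.com
# Port 465 is implicit TLS. mailx has no smtp-port variable, so the port goes in the smtp URL.
set smtp=smtps://smtp.sina.com:465
set smtp-auth-user=user_sunli@sina.com
set smtp-auth-password=xxxxxxxxxxxx
set smtp-auth=login
# Assumption: NSS certificate database at the CentOS default path; adjust to match your system.
set nss-config-dir=/etc/pki/nssdb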
 

4. Test

# echo "<message body>" | mail -s "<subject>" <recipient address>
# echo "hello" | mail -s "hehehe" sunli@bdszh.vip
For details, see: https://www.cnblogs.com/user-sunli/p/14221617.html

II. Monitoring script and scheduled task

Install nc

yum -y install nc

 

Write the script

vim /data/scripts/redis_mail.sh
#!/bin/bash
# Check Redis (56379) and Sentinel (46379) on this node and on the two peers.
# On failure, attempt a restart and send an alert whether or not the restart succeeds.
mail_to=sunli@bdszh.vip
local_ip=$(hostname -I | awk '{print $1}')

alert() {         # alert <host> <service>
    echo "Please check $1 $2" | mail -s "$2 is down" "$mail_to"
}
check_local() {   # check_local <port> <service>
    netstat -tnl | grep -q ":$1 " && return
    systemctl restart "$2.service"
    alert "$local_ip" "$2"
}
check_remote() {  # check_remote <host> <port> <service>
    nc -zw3 "$1" "$2" >/dev/null 2>&1 && return
    ansible "$1" -m systemd -a "name=$3 state=restarted"
    alert "$1" "$3"
}

check_local 56379 redis
check_local 46379 sentinel
for host in 10.0.36.132 10.0.36.134; do
    check_remote "$host" 56379 redis
    check_remote "$host" 46379 sentinel
done
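
A listening port does not prove that Redis is actually answering commands. As a possible refinement, the sketch below asks the server directly with redis-cli and treats anything other than PONG as a failure. The function name redis_alive and the REDIS_CLI variable are made up for this example, and it assumes no password (requirepass) is configured. Sentinel answers PING the same way, so the same helper can probe port 46379.

```shell
#!/bin/bash
# Hypothetical helper, not part of the original script.
# REDIS_CLI (assumed name) lets you point at a different redis-cli binary.
REDIS_CLI=${REDIS_CLI:-redis-cli}

# redis_alive <host> <port>
# Succeeds only if the server replies PONG to PING.
# Assumes no requirepass is set; with a password the reply is an auth error.
redis_alive() {
    local reply
    reply=$("$REDIS_CLI" -h "$1" -p "$2" ping 2>/dev/null)
    [ "$reply" = "PONG" ]
}
```

It could stand in for the nc probe, for example: redis_alive 10.0.36.132 56379 || echo "no PONG from 10.0.36.132". Unlike nc -w3 it sets no timeout of its own, so it may be worth bounding the call with the coreutils timeout command.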
 

Scheduled task

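On Linux, the smallest scheduling unit in crontab is one minute, so it cannot run a job every few seconds directly and a workaround is needed. The idea: to run the script every 5 seconds, create twelve entries that all fire every minute, each sleeping 0, 5, 10, ... 55 seconds before it runs the script.
# crontab -e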
*/1 * * * *  /data/scripts/redis_mail.sh
*/1 * * * * sleep 5; /data/scripts/redis_mail.sh
*/1 * * * * sleep 10; /data/scripts/redis_mail.sh
*/1 * * * * sleep 15; /data/scripts/redis_mail.sh
*/1 * * * * sleep 20; /data/scripts/redis_mail.sh
*/1 * * * * sleep 25; /data/scripts/redis_mail.sh
*/1 * * * * sleep 30; /data/scripts/redis_mail.sh
*/1 * * * * sleep 35; /data/scripts/redis_mail.sh
*/1 * * * * sleep 40; /data/scripts/redis_mail.sh
*/1 * * * * sleep 45; /data/scripts/redis_mail.sh
*/1 * * * * sleep 50; /data/scripts/redis_mail.sh
*/1 * * * * sleep 55; /data/scripts/redis_mail.sh
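
The twelve entries can also be collapsed into one. This is a minimal sketch, assuming a hypothetical wrapper script (called /data/scripts/redis_loop.sh here) that cron starts once a minute and that loops internally. The name run_every is only illustrative.

```shell
#!/bin/bash
# run_every <interval_seconds> <count> <command...>
# Runs the command <count> times, sleeping <interval_seconds> between runs
# (no sleep after the last run).
run_every() {
    local interval=$1 count=$2 i=1
    shift 2
    while [ "$i" -le "$count" ]; do
        "$@"
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
        i=$((i + 1))
    done
    return 0
}
```

With run_every 5 12 /data/scripts/redis_mail.sh as the last line of the wrapper, the crontab would need only one entry: * * * * * /data/scripts/redis_loop.sh. Because each sleep starts after the previous run finishes, slow runs push later ones back, so the 5-second spacing is approximate.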

 

III. Simulate a failure

Stop Redis or Sentinel yourself, then check that the service is restarted and the alert mail arrives.
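
One way to watch the recovery during such a test is sketched below. It assumes Redis runs as redis.service on port 56379, as in the monitoring script. wait_until is a hypothetical helper name.

```shell
#!/bin/bash
# wait_until <attempts> <command...>
# Tries the command up to <attempts> times, one second apart.
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
wait_until() {
    local remaining=$1
    shift
    while ! "$@" >/dev/null 2>&1; do
        remaining=$((remaining - 1))
        if [ "$remaining" -le 0 ]; then
            return 1
        fi
        sleep 1
    done
    return 0
}
```

For example: systemctl stop redis.service; wait_until 30 nc -zw3 127.0.0.1 56379 && echo "redis is back". If the monitoring job is working, the port should come back within a few seconds and the alert mail should follow.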

       