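Setting Up and Using Nagios

I. Setting up the monitoring server
Scenario: deploy a Nagios monitoring server on the host with IP address 192.168.4.77.

1.1 Prepare the installation environment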
[root@server yum.repos.d]# yum repolist
[root@server yum.repos.d]# rpm -q gcc gcc-c++
[root@server yum.repos.d]# groupadd nagcmd
[root@server yum.repos.d]# useradd nagios
[root@server yum.repos.d]# usermod -G nagcmd nagios

[root@server yum.repos.d]# yum -y install php
[root@server yum.repos.d]# yum -y install httpd
[root@server yum.repos.d]# echo 77 > /var/www/html/index.html
[root@server yum.repos.d]# /etc/init.d/httpd start; chkconfig httpd on
[root@server yum.repos.d]# vim /var/www/html/test.php
<?php
echo "hello boy";
?>
:wq
[root@server yum.repos.d]# yum -y install elinks
[root@server yum.repos.d]# elinks -dump http://localhost
[root@server yum.repos.d]# elinks -dump http://localhost/test.php

[root@server yum.repos.d]# elinks -dump http://192.168.4.77
   77
[root@server yum.repos.d]# elinks -dump http://192.168.4.77/test.php
   hello boy
[root@server yum.repos.d]# rpm -qa gcc gcc-c++ php httpd
httpd-2.2.15-45.el6.x86_64
gcc-4.4.7-16.el6.x86_64
php-5.3.3-40.el6_6.x86_64
gcc-c++-4.4.7-16.el6.x86_64

1.2 Install Nagios
[root@server ~]# cd nagios
[root@server nagios]# tar -xf nagios-3.2.1.tar.gz
[root@server nagios]# cd nagios-3.2.1
[root@server nagios-3.2.1]# ./configure --help
[root@server nagios-3.2.1]# ./configure --with-nagios-user=nagios --with-nagios-group=nagcmd --with-command-user=nagios --with-command-group=nagcmd
[root@server nagios-3.2.1]# make all
[root@server nagios-3.2.1]# make install
[root@server nagios-3.2.1]# make install-init
[root@server nagios-3.2.1]# make install-commandmode
[root@server nagios-3.2.1]# make install-config
[root@server nagios-3.2.1]# make install-webconf
[root@server nagios-3.2.1]# ls /usr/local/nagios/
bin  etc  libexec  sbin  share  var

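bin        Nagios executables
etc        configuration files
libexec    plugin directory (the check commands that Nagios calls)
sbin       CGI programs (installed commands that each implement one function of the web interface)
share      web pages of the Nagios service
var        variable data (for example, log files)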

1.3 Install the Nagios monitoring plugins
[root@server nagios]# tar -xf nagios-plugins-1.4.14.tar.gz
[root@server nagios]# cd nagios-plugins-1.4.14
[root@server nagios-plugins-1.4.14]# ./configure
[root@server nagios-plugins-1.4.14]# make
[root@server nagios-plugins-1.4.14]# make install
[root@server nagios-plugins-1.4.14]# ls /usr/local/nagios/libexec/check_*

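How does Nagios monitor a host's resources?
While it runs, the Nagios service automatically calls the plugins in the plugin directory to check resources on the specified servers. For each check, the administrator can configure two thresholds:
    a warning threshold, given as a number or a percentage
    a critical threshold, given as a number or a percentage
If the monitored value is below the warning threshold, the state is OK.
If the monitored value is above the warning threshold but below the critical threshold, the state is WARNING.
If the monitored value is above the critical threshold, the state is CRITICAL.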
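
The three rules above can be sketched as a small shell function. This is only an illustration of the threshold logic, not code taken from any plugin: the function name classify_usage is hypothetical, values are assumed to be integers, and a value exactly equal to a threshold is assumed not to exceed it (which agrees with the check_users output shown in section 1.8).

```shell
# classify_usage VALUE WARN CRIT
# Prints OK, WARNING or CRITICAL for an integer VALUE where higher is worse.
# Assumption: a value equal to a threshold does not exceed it.
classify_usage() {
    value=$1
    warn=$2
    crit=$3
    if [ "$value" -gt "$crit" ]; then
        echo CRITICAL
    elif [ "$value" -gt "$warn" ]; then
        echo WARNING
    else
        echo OK
    fi
}

classify_usage 3 3 5    # prints OK
classify_usage 4 3 5    # prints WARNING
classify_usage 6 3 5    # prints CRITICAL
```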

1.4 Start the Nagios monitoring service
* With no further configuration, Nagios monitors the local host by default.
[root@server nagios-plugins-1.4.14]# service httpd restart
[root@server nagios-plugins-1.4.14]# /etc/rc.d/init.d/nagios start
[root@server nagios-plugins-1.4.14]# /etc/rc.d/init.d/nagios status

1.5 Add an authenticated user for the Nagios web interface
[root@server nagios-plugins-1.4.14]# vim /etc/httpd/conf.d/nagios.conf
AuthName "Nagios Access"
   AuthType Basic
   AuthUserFile /usr/local/nagios/etc/htpasswd.users
   Require valid-user
[root@server nagios-plugins-1.4.14]# htpasswd -c /usr/local/nagios/etc/htpasswd.users jim
New password:
Re-type new password:
Adding password for user jim
[root@server nagios-plugins-1.4.14]# cat /usr/local/nagios/etc/htpasswd.users
jim:UmR5.6.KoQcvY

Opening the page at this point shows the following warning:
It appears as though you do not have permission to view information for any of the services you requested...

If you believe this is an error, check the HTTP server authentication requirements for accessing this CGI
and check the authorization options in your CGI configuration file.

The login name cannot be chosen freely: the authorized account is already defined in /usr/local/nagios/etc/cgi.cfg:
main_config_file=/usr/local/nagios/etc/nagios.cfg
physical_html_path=/usr/local/nagios/share
url_html_path=/nagios
authorized_for_system_information=nagiosadmin
authorized_for_configuration_information=nagiosadmin

[root@server etc]# rm -f /usr/local/nagios/etc/htpasswd.users
[root@server etc]# htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
New password:
Re-type new password:
Adding password for user nagiosadmin
[root@server etc]# cat /usr/local/nagios/etc/htpasswd.users
nagiosadmin:xxFlLUqalWZGs


1.6 Open the Nagios web interface and view the monitoring information.
ping 192.168.4.77
http://192.168.4.77/nagios

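1.7 Resources monitored by default
Current Load       CPU load
Current Users      number of logged-in users
HTTP               web service
PING               whether the host is reachable
Root Partition     usage of the root partition
SSH                SSH remote login service
Swap Usage         usage of the swap partition
Total Processes    total number of processes on the system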

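Nagios configuration files
/usr/local/nagios/etc
cgi.cfg            defines the users allowed to access the CGI programs
nagios.cfg         main configuration file of the Nagios service *
resource.cfg       macro definitions (variables used by the Nagios service, such as $USER1$)
htpasswd.users     accounts for web interface authentication
/usr/local/nagios/etc/objects
commands.cfg       defines the check commands *
timeperiods.cfg    defines the monitoring time period templates
contacts.cfg       defines the mail addresses that receive alerts *
templates.cfg      defines the monitoring templates
localhost.cfg      configuration for monitoring the local host
windows.cfg
switch.cfg
printer.cfg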


1.8 Using the monitoring plugins
cd /usr/local/nagios/libexec
./PLUGIN_NAME -h    show the plugin's help text
./check_users -h

./check_users
[root@server libexec]# ./check_users -w 3 -c 5
USERS OK - 3 users currently logged in |users=3;3;5;0
[root@server libexec]# ./check_users -w 1 -c 2
USERS CRITICAL - 3 users currently logged in |users=3;1;2;0
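
Besides the text it prints, a plugin reports its result through its exit code: by the standard Nagios plugin convention, 0 is OK, 1 is WARNING, 2 is CRITICAL and 3 is UNKNOWN. The wrapper below is a minimal sketch for seeing that code when testing a plugin by hand; the function names state_name and run_check are hypothetical, and any code outside 0 to 2 is simply reported as UNKNOWN.

```shell
# state_name CODE: map a plugin exit code to the Nagios state name.
state_name() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        *) echo UNKNOWN ;;
    esac
}

# run_check COMMAND [ARGS...]: run a plugin, then report its exit state.
run_check() {
    "$@"
    code=$?
    echo "exit code $code -> $(state_name "$code")"
    return "$code"
}

# Stand-in command so the sketch runs anywhere; prints "exit code 0 -> OK".
run_check true

# With a real plugin (path from the steps above):
# run_check /usr/local/nagios/libexec/check_users -w 1 -c 2
```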


./check_disk
[root@server libexec]# df -h
[root@server libexec]# ./check_disk -w 50% -c 30% -p /boot
[root@server libexec]# dd if=/dev/zero of=/boot/test.txt bs=1M count=400
[root@server libexec]# df -h
[root@server libexec]# ./check_disk -w 50% -c 30% -p /boot
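
Note that the check_disk thresholds refer to free space, so the comparison runs the other way round: with -w 50% -c 30% the plugin warns when less than 50% of the partition is free and goes critical when less than 30% is free. The sketch below illustrates that inverted logic only. The function name classify_free is hypothetical, it works on integer percentages, and how the real plugin treats a value exactly equal to a threshold is not covered here.

```shell
# classify_free FREE_PCT WARN_PCT CRIT_PCT
# Prints OK, WARNING or CRITICAL for an integer free-space percentage,
# where a LOWER value is worse (the reverse of a usage threshold).
# Assumption: a value equal to a threshold counts as not below it.
classify_free() {
    free=$1
    warn=$2
    crit=$3
    if [ "$free" -lt "$crit" ]; then
        echo CRITICAL
    elif [ "$free" -lt "$warn" ]; then
        echo WARNING
    else
        echo OK
    fi
}

classify_free 92 50 30    # prints OK
classify_free 40 50 30    # prints WARNING
classify_free 8 50 30     # prints CRITICAL
```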

./check_procs
[root@server libexec]# ./check_procs -w 90 -c 100 -s R

./check_http
[root@server libexec]# ./check_http -H 192.168.4.77 -p 80

./check_ssh
[root@server libexec]# ./check_ssh -H 192.168.4.77

./check_tcp
[root@server libexec]# ./check_tcp -H 127.0.0.1 -p 22
[root@server libexec]# ./check_tcp -H 172.40.58.140 -p 3128

+++++++++++++++++++++++++++++++++++++++++++
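II. Monitoring configuration steps
1. Define a check command
define command{
    command_name    COMMAND_NAME
    command_line    PLUGIN_AND_ARGUMENTS
 }
2. Reference the command from the monitored host's configuration file
define service{
       use                    SERVICE_TEMPLATE_NAME
       host_name              HOST_NAME
       service_description    DESCRIPTION
       check_command          COMMAND_NAME
       }
3. Load the monitored host's configuration file from nagios.cfg
cfg_file=/usr/local/nagios/etc/objects/HOST_CONFIG_FILE
4. Verify the configuration
[root@server objects]# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
Total Warnings: 0
Total Errors:   0
5. Restart the monitoring service
[root@server ~]# /etc/init.d/nagios restart
6. Open the web interface and view the monitoring information
http://monitor-ip/nagios/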
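
Steps 4 and 5 are usually run together, and it is safer to restart only when verification succeeds. The function below is a sketch of that idea. The name reload_if_valid is hypothetical, and it assumes that nagios -v exits with a non-zero status when the configuration contains errors.

```shell
# reload_if_valid VERIFY_CMD RESTART_CMD
# Runs VERIFY_CMD; runs RESTART_CMD only if verification exits with status 0.
reload_if_valid() {
    verify_cmd=$1
    restart_cmd=$2
    if sh -c "$verify_cmd" >/dev/null 2>&1; then
        sh -c "$restart_cmd"
    else
        echo "configuration check failed; service not restarted" >&2
        return 1
    fi
}

# Dry run with stand-in commands; prints "restarted".
reload_if_valid "true" "echo restarted"

# Intended use with the paths from this guide:
# reload_if_valid \
#   "/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg" \
#   "/etc/init.d/nagios restart"
```

Passing both commands as arguments keeps the sketch testable without a Nagios installation.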

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++

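2.1 Monitoring a remote server
Monitor the following on the server 192.168.4.67: the web service, the FTP service, the number of logged-in users, the usage of the boot partition, and the number of processes.

1. Define the check commands
[root@server objects]# vim /usr/local/nagios/etc/objects/commands.cfg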
define command {
         command_name check_67_ftp
         command_line $USER1$/check_ftp -H 192.168.4.67
}

define command {
         command_name check_67_httpd
         command_line $USER1$/check_http -H 192.168.4.67
}

2. Reference the commands from the remote host's configuration file
[root@server objects]# vim /usr/local/nagios/etc/objects/192.168.4.67.cfg
define host{
        use                     linux-server
        host_name               ser67
        alias                   This is my server
        address                 192.168.4.67
        }

define service{
        use                             local-service
        host_name                       ser67
        service_description             ftp
        check_command                   check_67_ftp
        }

define service{
        use                             local-service
        host_name                       ser67
        service_description             httpd
        check_command                   check_67_httpd
        }

3. Load the remote host's configuration file from nagios.cfg
[root@server objects]# vim /usr/local/nagios/etc/nagios.cfg
cfg_file=/usr/local/nagios/etc/objects/192.168.4.67.cfg

4. Verify the configuration
[root@server objects]# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

5. Restart the monitoring service
[root@server ~]# /etc/init.d/nagios restart

6. Open the web interface on the monitoring server and view the monitoring information
http://192.168.4.77/nagios/

+++++++++++++++++++++++++++++++++++++++++++++++++

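Configuring the monitoring server to monitor private data on the remote server (number of logged-in users, boot partition usage, number of processes)
These values can only be read on the remote host itself, so they are collected through NRPE.

Configuration on the remote server (192.168.4.67)
1. Install the monitoring plugins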
[root@web1 nagios]# tar xf nagios-plugins-1.4.14.tar.gz
[root@web1 nagios]# cd nagios-plugins-1.4.14
[root@web1 nagios-plugins-1.4.14]# yum -y install gcc gcc-c++
[root@web1 nagios-plugins-1.4.14]# ./configure
[root@web1 nagios-plugins-1.4.14]# make&&make install
[root@web1 nagios-plugins-1.4.14]# ls /usr/local/nagios/libexec/check_*

[root@web1 ~]# /usr/local/nagios/libexec/check_users -w 3 -c 5
USERS OK - 3 users currently logged in |users=3;3;5;0

[root@web1 ~]# /usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /boot
DISK OK - free space: /boot 415 MB (92% inode=99%);| /boot=35MB;380;428;0;476

[root@web1 ~]# /usr/local/nagios/libexec/check_procs -w 50 -c 60
PROCS CRITICAL: 101 processes


2. Run the NRPE service
Install NRPE on 192.168.4.67:
[root@web1 nagios]# useradd nagios
[root@web1 nagios]# groupadd nagcmd
[root@web1 nagios]# usermod -G nagcmd nagios
[root@web1 nagios]# tar xf nrpe-2.12.tar.gz
[root@web1 nagios]# cd nrpe-2.12
[root@web1 nrpe-2.12]# yum -y install openssl-devel
[root@web1 nrpe-2.12]# ./configure&&make && make install
[root@web1 nrpe-2.12]# make install-plugin
[root@web1 nrpe-2.12]# make install-daemon
[root@web1 nrpe-2.12]# make install-daemon-config
[root@web1 nrpe-2.12]# make install-xinetd

[root@web1 nagios]# grep only_from /etc/xinetd.d/nrpe
    only_from       = 127.0.0.1 192.168.4.77
[root@web1 nagios]# tail -1 /etc/services
nrpe        5666/tcp        # NRPE
[root@web1 nagios]# yum -y install xinetd
[root@web1 nagios]# /etc/init.d/xinetd restart
[root@web1 nagios]# chkconfig xinetd on
[root@web1 nagios]# netstat -anptu |grep :5666
tcp        0      0 :::5666                     :::*                        LISTEN      2033/xinetd     

Edit the NRPE configuration file nrpe.cfg to define the check commands
command[COMMAND_NAME]=PLUGIN_AND_ARGUMENTS
[root@web1 nagios]# sed -n '199,203'p /usr/local/nagios/etc/nrpe.cfg
command[check_67_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_67_cpu]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
command[check_67_boot]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /boot
command[check_67_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
command[check_67_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200
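
The names in square brackets are what the monitoring server later passes to check_nrpe -c, so they must match exactly on both sides. As a convenience, the sketch below lists the command names defined in an nrpe.cfg file so they can be compared with commands.cfg on the monitoring server. The function name list_nrpe_commands is hypothetical, and it assumes each definition starts at the beginning of a line, as in the excerpt above.

```shell
# list_nrpe_commands FILE: print the NAME of every command[NAME]=... line.
# Lines that do not start with "command[" (such as comments) are ignored.
list_nrpe_commands() {
    sed -n 's/^command\[\([^]]*\)]=.*/\1/p' "$1"
}

# Demonstration on a temporary file with two active and one commented entry.
cfg=$(mktemp)
printf '%s\n' \
    'command[check_67_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10' \
    '#command[disabled]=/bin/true' \
    'command[check_67_boot]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /boot' \
    > "$cfg"
list_nrpe_commands "$cfg"    # prints check_67_users, then check_67_boot
rm -f "$cfg"

# On the remote host from this guide:
# list_nrpe_commands /usr/local/nagios/etc/nrpe.cfg
```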
 

[root@web1 libexec]# /usr/local/nagios/libexec/check_nrpe -H 127.0.0.1 -c check_67_zombie_procs


Configuration on the monitoring server (192.168.4.77)
1. Install the check_nrpe plugin
[root@server nagios]# tar -xf nrpe-2.12.tar.gz
[root@server nagios]# cd nrpe-2.12
[root@server nrpe-2.12]# yum -y install openssl-devel
[root@server nrpe-2.12]# ./configure
[root@server nrpe-2.12]# make all
[root@server nrpe-2.12]# make install-plugin
[root@server nrpe-2.12]# ls /usr/local/nagios/libexec/check_nrpe
/usr/local/nagios/libexec/check_nrpe

2. Use the plugin to connect to the NRPE service on the monitored host and test the commands defined there
[root@server nrpe-2.12]# /usr/local/nagios/libexec/check_nrpe -H 192.168.4.67 -c check_67_users
USERS OK - 2 users currently logged in |users=2;5;10;0
[root@server nrpe-2.12]# /usr/local/nagios/libexec/check_nrpe -H 192.168.4.67 -c check_67_boot
DISK OK - free space: /boot 415 MB (92% inode=99%);| /boot=35MB;380;428;0;476

1. Define the check commands
vim commands.cfg
define command{
               command_name   check_ser_67_users
               command_line   $USER1$/check_nrpe -H 192.168.4.67 -c check_67_users
}

define command{
               command_name   check_ser_67_boot
               command_line   $USER1$/check_nrpe -H 192.168.4.67 -c check_67_boot
}

define command{
               command_name   check_ser_67_total_process
               command_line   $USER1$/check_nrpe -H 192.168.4.67 -c check_67_total_procs
}

define command{
               command_name   check_ser_67_zombie_process
               command_line   $USER1$/check_nrpe -H 192.168.4.67 -c check_67_zombie_procs
}

2. Reference the commands from the remote host's configuration file
vim 192.168.4.67.cfg
define service{
        use                             local-service
        host_name                       ser67
        service_description             user
        check_command                   check_ser_67_users
        }

define service{
        use                             local-service
        host_name                       ser67
        service_description             boot
        check_command                   check_ser_67_boot
        }

define service{
        use                             local-service
        host_name                       ser67
        service_description             totalprocess
        check_command                   check_ser_67_total_process
        }

define service{
        use                             local-service
        host_name                       ser67
        service_description             zombieprocess
        check_command                   check_ser_67_zombie_process
        }
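
The four service blocks differ only in their description and check command, so they can also be generated instead of typed by hand. The helper below is a sketch of that approach. The function name service_block is hypothetical, and the template name local-service is fixed because that is what this guide uses.

```shell
# service_block HOST DESCRIPTION COMMAND
# Prints one Nagios service definition using the local-service template.
service_block() {
    printf 'define service{\n'
    printf '        %-24s%s\n' use local-service
    printf '        %-24s%s\n' host_name "$1"
    printf '        %-24s%s\n' service_description "$2"
    printf '        %-24s%s\n' check_command "$3"
    printf '        }\n'
}

# Print the first block from the list above.
service_block ser67 user check_ser_67_users
```

The output can be appended to the host's configuration file and should still be checked with nagios -v before restarting.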

3. Load the remote host's configuration file from nagios.cfg
[root@server objects]# vim /usr/local/nagios/etc/nagios.cfg
cfg_file=/usr/local/nagios/etc/objects/192.168.4.67.cfg

4. Verify the configuration
[root@server objects]# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

5. Restart the monitoring service
[root@server ~]# /etc/init.d/nagios restart

6. Open the web interface on the monitoring server and view the monitoring information
http://192.168.4.77/nagios/

+++++++++++++++++++++++++++++++++++++++++++++++++++++++

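2.2 Monitoring the local host: localhost.cfg

2.3 Customizing the resources monitored on the local host
Monitor the usage of the local boot partition
Stop monitoring the local swap partition (comment out the Swap Usage service definition in localhost.cfg)
Monitor the status of the local FTP service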
+++++++++++++++++++++++++++++++++++++++++++++++++++++
1. Define the check commands
define command {
         command_name check_local_boot
         command_line $USER1$/check_disk -w 20% -c 10% -p /boot
}

define command {
         command_name check_local_ftp
         command_line $USER1$/check_ftp -H localhost
}

2. Reference the commands from the local host's configuration file
define service{
        use                             local-service
        host_name                       localhost
        service_description             ftp
        check_command                   check_local_ftp
        }

define service{
        use                             local-service
        host_name                       localhost
        service_description             boot
        check_command                   check_local_boot
        }

3. Load the local host's configuration file from nagios.cfg
cfg_file=/usr/local/nagios/etc/objects/localhost.cfg

4. Verify the configuration
[root@server objects]# alias checknagios='/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg'
[root@server objects]# checknagios

5. Restart the monitoring service
[root@server ~]# /etc/init.d/nagios restart
6. Open the web interface and view the monitoring information
http://192.168.4.77/nagios/


2.4 Configuring alerts

[root@server objects]# grep email  /usr/local/nagios/etc/objects/contacts.cfg
        email                           nagios@localhost    ; <<***** CHANGE THIS TO YOUR EMAIL ADDRESS ******
(replace nagios@localhost with the address that should receive alert mail, for example lyd@163.com)
grep nagios /etc/passwd
/etc/init.d/postfix start

++++++++++++++++++++++++++++++++++++++++++++++
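Further topics
1. Sending alert mail through a third-party mail server
     sendmail
2. Host dependencies
3. Service dependencies
4. Other alert channels (WeChat alerts, SMS alerts)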
