Configuring the Nagios Monitoring System on CentOS 5

Our networks and servers need careful attention, otherwise they will cause us a lot of trouble. It is better to let a tool help us look after these machines: Nagios.

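Here I set up a simple monitoring host. A deeper configuration takes more study, and there is plenty of material online.
Below is my own configuration. It may contain mistakes, so corrections are welcome. Thank you for reading.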

1. Create the nagios user
adduser nagios
mkdir /usr/local/nagios
chown nagios:nagios /usr/local/nagios

2. Extract the downloaded package and compile it
tar xzvf nagios-version.tar.gz

Change into the extracted directory and run:
./configure --prefix=/usr/local/nagios \
    --with-nagios-user=nagios \
    --with-nagios-group=nagios \
    --with-command-group=nagios
make all
make install
make install-config
make install-init
Nagios is now installed.
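3. Create the authenticated user for accessing Nagios
/usr/bin/htpasswd -c /usr/local/nagios/etc/htpasswd andy
Set the password when prompted.
Apache is already installed under /usr/local/apache/.
4. Add Nagios to Apache: open /usr/local/apache/conf/httpd.conf and append the following to the end of the file: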
ScriptAlias /nagios/cgi-bin /usr/local/nagios/sbin
<Directory "/usr/local/nagios/sbin">
Options ExecCGI
AllowOverride None
Order allow,deny
Allow from all
AuthName "Nagios Access"
AuthType Basic
AuthUserFile /usr/local/nagios/etc/htpasswd
Require valid-user
</Directory>

Alias /nagios /usr/local/nagios/share
<Directory "/usr/local/nagios/share">
Options None
AllowOverride None
Order allow,deny
Allow from all
AuthName "Nagios Access"
AuthType Basic
AuthUserFile /usr/local/nagios/etc/htpasswd
Require valid-user

</Directory>
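
Both Directory blocks above depend on the password file created in step 3, so before restarting Apache it can help to confirm that every AuthUserFile path really exists. The following is only a minimal sketch: `check_auth_files` is a hypothetical helper name (not part of Apache or Nagios), and it assumes each directive sits on its own line with an unquoted path, as in the configuration above.

```shell
#!/bin/sh
# Hypothetical helper: report every AuthUserFile path in a config file
# and whether the file exists. Returns non-zero if any path is missing.
check_auth_files() {
    conf=$1
    status=0
    # Take the second field of each AuthUserFile line, de-duplicated.
    for f in $(awk '$1 == "AuthUserFile" { print $2 }' "$conf" | sort -u); do
        if [ -f "$f" ]; then
            echo "ok      $f"
        else
            echo "missing $f"
            status=1
        fi
    done
    return $status
}
```

It could be called as `check_auth_files /usr/local/apache/conf/httpd.conf`.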
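5. Restart Apache (for example with /usr/local/apache/bin/apachectl restart, assuming the default layout of a source install)

Then open Nagios at http://IP/nagios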

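6. Install the plugins

This article uses nagios-plugins-1.4.13.tar.gz.
Extract it:
tar zxvf nagios-plugins-1.4.13.tar.gz
Change into the extracted directory nagios-plugins-1.4.13.
Configure:
./configure --prefix=/usr/local/nagios  # same as the Nagios installation directory
Build and install:
make
make install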
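
With the prefix above, the plugins land in /usr/local/nagios/libexec. As a rough sanity check after `make install`, one could confirm that the plugins used later in this article are present. This is only a sketch: `check_plugins` is a hypothetical helper, and the plugin list assumes the stock commands.cfg, where for example check_local_disk calls check_disk.

```shell
#!/bin/sh
# Hypothetical helper: verify that the plugins used later in this article
# are present and executable in the given libexec directory.
check_plugins() {
    dir=$1
    missing=0
    for p in check_ping check_disk check_users check_procs \
             check_load check_swap check_tcp check_http; do
        if [ -x "$dir/$p" ]; then
            echo "found   $p"
        else
            echo "missing $p"
            missing=$((missing + 1))
        fi
    done
    # Succeed only when nothing is missing.
    [ "$missing" -eq 0 ]
}
```

It could be called as `check_plugins /usr/local/nagios/libexec`.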

7. Detailed Nagios configuration

(1)
First edit /usr/local/nagios/etc/objects/localhost.cfg. I put all the hosts to be monitored in localhost.cfg.
================================================
Define the hosts to monitor (oracle.test.com, CVS)
define host{
        use                     linux-server
        host_name               oracle.test.com
        alias                   Database
        address                 192.168.1.176
        contact_groups          admins
        parents                 routergw
        icon_image              server.gif
        statusmap_image         server.gd2
        2d_coords               500,200
        3d_coords               500,200,100
        }
define host{
        use                     linux-server
        host_name               CVS
        alias                   CVS
        address                 192.168.1.183
        contact_groups          admins
        parents                 routergw
        icon_image              server.gif
        statusmap_image         server.gd2
        2d_coords               500,200
        3d_coords               500,200,100
        }
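
The two host blocks differ only in name, alias and address, so further hosts could be generated rather than copied by hand. A minimal sketch, assuming the same template, contact group and parent as above: `make_host` is a hypothetical helper, and it deliberately leaves out the icon and coordinate lines, since each host would want its own coordinates on the status map.

```shell
#!/bin/sh
# Hypothetical helper: print a host definition in the same shape as the
# two blocks above. Arguments: host_name, alias, address.
make_host() {
    cat <<EOF
define host{
        use                     linux-server
        host_name               $1
        alias                   $2
        address                 $3
        contact_groups          admins
        parents                 routergw
        }
EOF
}
```

For example, `make_host CVS CVS 192.168.1.183` prints a block that can be reviewed and then appended to localhost.cfg.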
================================================
Define the host group and the service group
define hostgroup{
        hostgroup_name  linux-servers ; The name of the hostgroup
        alias           Linux Servers ; Long name of the group
        members         *     ; Comma separated list of hosts that belong to this group
        }
define servicegroup{
        servicegroup_name linuxserv
        alias services
        members  oracle.test.com,SSH,oracle.test.com,PING,oracle.test.com,disk,CVS,PING,CVS,disk,CVS,SSH
        }
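
The members line of a servicegroup is a flat, comma-separated list that alternates host name and service description, which is easy to get wrong by one item. The sketch below prints such a list as pairs so it can be proofread. `list_pairs` is a hypothetical helper, and it assumes the list contains no spaces around the commas, as in the definition above.

```shell
#!/bin/sh
# Hypothetical helper: print a servicegroup "members" value as one
# host/service pair per line. Fails if the list has an odd number of items.
list_pairs() {
    echo "$1" | tr ',' '\n' | awk '
        NR % 2 == 1 { host = $0; next }      # odd items are host names
                    { print host " -> " $0 } # even items are service names
        END         { if (NR % 2 == 1) exit 1 }
    '
}
```

For example, `list_pairs "oracle.test.com,SSH,CVS,PING"` prints `oracle.test.com -> SSH` and `CVS -> PING` on separate lines.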
================================================
Define the services to monitor
define service{
        use                              local-service         
        host_name                        oracle.test.com,CVS
        service_description                 PING
        check_command                   check_ping!100.0,20%!500.0,60%
        notifications_enabled           1
        }
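
In a check_command, the parts after each "!" become $ARG1$, $ARG2$ and so on. For check_ping, 100.0,20% is the warning threshold (average round trip above 100 ms or packet loss above 20%) and 500.0,60% is the critical one. The sketch below shows the plugin command line this would expand to. It assumes the stock check_ping definition from commands.cfg and that $USER1$ points at /usr/local/nagios/libexec; `expand_check_ping` is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: show the plugin command line that a check_ping
# check_command would expand to, assuming the stock definition
#   $USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5
# Arguments: check_command string, host address.
expand_check_ping() {
    name=$(echo "$1" | cut -d'!' -f1)   # command name
    warn=$(echo "$1" | cut -d'!' -f2)   # $ARG1$: warning rta(ms),loss%
    crit=$(echo "$1" | cut -d'!' -f3)   # $ARG2$: critical rta(ms),loss%
    echo "/usr/local/nagios/libexec/$name -H $2 -w $warn -c $crit -p 5"
}
```

Running the printed command by hand as the nagios user is a quick way to see what Nagios itself will see for a host.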
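# Define a service to check the disk space of the root partition.
# Warning if < 20% free, critical if < 10% free space on partition.
# Note: with the stock commands.cfg, the check_local_* commands run on the
# Nagios server itself; measuring the remote hosts listed in host_name
# requires a remote check mechanism instead.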
define service{
        use                             local-service
        host_name                       oracle.test.com,CVS
        service_description                 disk
        check_command                   check_local_disk!20%!10%!/
        notifications_enabled           1
        }
define service{
        use                             local-service        
        host_name                       oracle.test.com,CVS
        service_description                users
        check_command                   check_local_users!20!50
        notifications_enabled           1
        }
# Define a service to check the number of currently running procs
# on the local machine.  Warning if > 250 processes, critical if
# > 400 processes.
define service{
        use                             local-service 
        host_name                       oracle.test.com,CVS
        service_description                procs
        check_command                  check_local_procs!250!400!RSZDT
        notifications_enabled           1
        }

# Define a service to check the load on the local machine.
define service{
        use                             local-service        
        host_name                       oracle.test.com,CVS
        service_description                local_load
        check_command                  check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        notifications_enabled           1

        }
define service{
        use                             local-service        
        host_name                       oracle.test.com,CVS
        service_description                 swap
        check_command                   check_local_swap!20!10
        notifications_enabled           1
        }

# Define a service to check SSH (TCP port 22) on each host.
# Notifications are enabled for this service.
define service{
        use                             local-service        
        host_name                       oracle.test.com,CVS
        service_description                SSH
        check_command                   check_tcp!22!1.0!10.0
        notifications_enabled           1
        }

# Define a service to check HTTP on oracle.test.com.
# Notifications are enabled for this service.
define service{
        use                             local-service         
        host_name                       oracle.test.com
        service_description                 HTTP
        check_command                   check_http
        notifications_enabled           1
        }
================================================
(2) Edit the contacts in /usr/local/nagios/etc/objects/contacts.cfg
define contact{
        contact_name                    nagiosadmin            
        use                             generic-contact        
        alias                           Nagios Admin            
        service_notification_commands   notify-service-by-sms,notify-service-by-email
        host_notification_commands      notify-host-by-sms,notify-host-by-email

        email                           andylhz@XX.com    ; email address for urgent notifications
        pager                           138*****          ; mobile number for urgent SMS notifications
        }
===============================================
(3) Edit the command definition file /usr/local/nagios/etc/objects/commands.cfg

Define the service notification command
define command{
        command_name notify-service-by-sms
        command_line /usr/bin/sms -f 138******5 -p ******  -t $CONTACTPAGER$ -m "$HOSTNAME$ $SERVICEDESC$ is $SERVICESTATE$ on $TIME$ result is $SERVICEOUTPUT$" $CONTACTPAGER$
}

Define the host notification command
define command{
        command_name notify-host-by-sms
        command_line /usr/bin/sms -f 138******5 -p ******  -t $CONTACTPAGER$ -m "Host $HOSTSTATE$ alert for $HOSTNAME$! on '$LONGDATETIME$' " $CONTACTPAGER$
       }
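
Plugin output can be long, so it is worth previewing what the service message will look like before relying on it. The sketch below builds the text in the form "HOST SERVICE is STATE on TIME result is OUTPUT" from sample values and reports whether it fits in a single 160-character SMS. `preview_service_sms` is a hypothetical helper, and the 160-character limit is an assumption about the SMS gateway in use.

```shell
#!/bin/sh
# Hypothetical helper: preview the service SMS text with sample values
# in place of the Nagios macros.
# Arguments: host name, service description, state, time, plugin output.
preview_service_sms() {
    msg="$1 $2 is $3 on $4 result is $5"
    printf '%s\n' "$msg"
    # Non-zero status when the text would not fit in one 160-character SMS.
    [ "${#msg}" -le 160 ]
}
```

For example, `preview_service_sms CVS SSH CRITICAL 10:15:00 "Connection refused"` prints the message and returns success because it is short enough.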
===============================================
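Note: the SMS sending program sms used above must be installed separately. Many such programs are available online, so it is not covered here.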

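Test whether the configuration is correct
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
If everything is fine, output like the following appears, which means there are no problems: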
Nagios Core 3.2.0
Copyright (c) 2009 Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 08-12-2009
License: GPL
Website: http://www.nagios.org
Reading configuration data...
   Read main config file okay...
Processing object config file '/usr/local/nagios/etc/objects/commands.cfg'...
Processing object config file '/usr/local/nagios/etc/objects/contacts.cfg'...
Processing object config file '/usr/local/nagios/etc/objects/timeperiods.cfg'...
Processing object config file '/usr/local/nagios/etc/objects/templates.cfg'...
Processing object config file '/usr/local/nagios/etc/objects/localhost.cfg'...
Processing object config file '/usr/local/nagios/etc/objects/switch.cfg'...
   Read object config files okay...
Running pre-flight check on configuration data...
Checking services...
        Checked 16 services.
Checking hosts...
        Checked 3 hosts.
Checking host groups...
        Checked 2 host groups.
Checking service groups...
        Checked 1 service groups.
Checking contacts...
        Checked 1 contacts.
Checking contact groups...
        Checked 1 contact groups.
Checking service escalations...
        Checked 0 service escalations.
Checking service dependencies...
        Checked 0 service dependencies.
Checking host escalations...
        Checked 0 host escalations.
Checking host dependencies...
        Checked 0 host dependencies.
Checking commands...
        Checked 26 commands.
Checking time periods...
        Checked 5 time periods.
Checking for circular paths between hosts...
Checking for circular host and service dependencies...
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...
Total Warnings: 0
Total Errors:   0
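
When the check is run from a script, the last line of the summary is the part that matters. The sketch below reads the verification output on standard input and succeeds only when it reports zero errors. `config_ok` is a hypothetical helper, and it assumes the "Total Errors:" line has the format shown above.

```shell
#!/bin/sh
# Hypothetical helper: read the output of "nagios -v" on standard input and
# succeed only if the summary reports zero errors.
config_ok() {
    grep -Eq '^Total Errors:[[:space:]]+0[[:space:]]*$'
}
```

It could be used as `/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg | config_ok && service nagios start`, so that Nagios is started only after a clean check.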

Start Nagios
service nagios start

 





This article is reposted from andylhz's 51CTO blog. Original link: http://blog.51cto.com/andylhz2009/211044. To repost it, please contact the original author.
