Nagios & Cacti

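Nagios + Cacti is honestly not as easy to use as Zabbix, but for service monitoring that only needs alerting and no graphs, Nagios is a good fit. Because of a recent IDC migration, I redeployed our old Nagios + Cacti environment from scratch.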

Nagios:

  • Preparation:
apt-get install autoconf gcc libc6 build-essential bc gawk dc gettext \
libmcrypt-dev libssl-dev make unzip apache2 apache2-utils php5 libgd2-xpm-dev
/usr/sbin/useradd -m -s /bin/bash nagios # create the nagios user
/usr/sbin/groupadd nagcmd # create the nagcmd group, used for running external commands (e.g. via nrpe)
/usr/sbin/usermod -a -G nagcmd nagios
/usr/sbin/usermod -a -G nagcmd www-data
  • Installation:
tar zxvf nagios-4.3.1.tar.gz
cd nagios-4.3.1
./configure --prefix=/opt/nagios --with-command-group=nagcmd --with-httpd-conf=/etc/apache2/sites-enabled
make all
make install
make install-init
make install-config
make install-commandmode
update-rc.d nagios defaults # register the init script so Nagios starts at boot
  • Nagios directory layout:
root@10.1.1.208:nagios# ls
bin  etc  libexec  log  sbin  share  var

The main Nagios configuration files live under etc, and the plugins live under libexec.

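  • Configuring Nagios:
    Our Nagios mainly monitors server hardware health, such as whether the disks are intact, and every check goes through NRPE to reduce the load on the monitoring server. The Nagios configuration is split across several files, and you register whichever ones you need in the main nagios.cfg.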
# You can specify individual object config files as shown below:
cfg_file=/opt/nagios/etc/objects/commands.cfg
cfg_file=/opt/nagios/etc/objects/contacts.cfg
cfg_file=/opt/nagios/etc/objects/timeperiods.cfg
cfg_file=/opt/nagios/etc/objects/templates.cfg
#
cfg_file=/opt/nagios/etc/objects/service.cfg
cfg_file=/opt/nagios/etc/objects/group.cfg
# Definitions for monitoring the local (Linux) host
#cfg_file=/opt/nagios/etc/objects/localhost.cfg
cfg_file=/opt/nagios/etc/objects/host_debian.cfg
cfg_file=/opt/nagios/etc/objects/host_centos.cfg
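
Because every object file has to be registered with a cfg_file line, a typo in a path is easy to miss until Nagios refuses to start. The following is a minimal sketch, not part of the original setup, that lists the active cfg_file entries and reports any that do not exist on disk. The function names list_cfg_files and check_cfg_files are hypothetical helpers introduced here for illustration.

```shell
#!/bin/sh
# Print every active (non-commented) cfg_file path found in a Nagios main config.
list_cfg_files() {
    # $1: path to nagios.cfg
    sed -n 's/^[[:space:]]*cfg_file[[:space:]]*=[[:space:]]*//p' "$1"
}

# Report registered object files that are missing on disk.
# Returns 1 if at least one file is missing, 0 otherwise.
check_cfg_files() {
    missing=$(list_cfg_files "$1" | while IFS= read -r f; do
        [ -f "$f" ] || echo "missing: $f"
    done)
    if [ -n "$missing" ]; then
        echo "$missing"
        return 1
    fi
    return 0
}

# Example, using the path from the layout above:
# check_cfg_files /opt/nagios/etc/nagios.cfg
```

This only checks that the files exist; the syntax of their contents is still validated by nagios -v.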

Then edit the corresponding files. Suppose I want to add a Linux server and monitor its disks; the steps are as follows:
1. Edit commands.cfg and add the corresponding command:

# check hardware Disk
define command{
        command_name check_storage_disk_nrpe
        command_line /opt/nagios/libexec/check_storage_disk_nrpe $HOSTADDRESS$ check_storage_disk
}

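Put the corresponding script under libexec. Roughly speaking, Nagios asks the remote machine to run the check_storage_disk command over NRPE, and check_storage_disk is a monitoring script that lives on the remote machine.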

#!/bin/bash
# Wrapper around check_nrpe: runs a remote NRPE command and maps its output to a Nagios state.
PLUGINS=/opt/nagios/libexec
CHECK_NRPE=$PLUGINS/check_nrpe
if [ $# -lt 2 ];then
    echo "Usage: $0 host command"
    exit 2
fi
host=$1
comm=$2
#command_line    $USER1$/check_snmp_traffic $HOSTADDRESS$ public 3 " > 80 " " > 90 "
# -n disables SSL, -p is the NRPE port on the remote host
res=$("$CHECK_NRPE" -H "$host" -n -p 57000 -c "$comm")
if [ $? -ne 0 ];then
    if [ "${res}" == "CHECK_NRPE: Socket timeout after 10 seconds." ];then
        # A timeout is reported as OK (exit 0) here, so an unreachable host does not raise an alert
        echo "connect failed"
        exit 0
    else
        echo "Check Storage UNKNOWN"
        exit 3
    fi
fi
if [ "${res}" == "Storage Disk Normal" ];then
    echo "Check Storage OK"
    exit 0
else
    echo "${res}"
    exit 2
fi
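
The wrapper relies on the convention that Nagios derives the service state from the plugin's exit status: 0 is OK, 1 is WARNING, 2 is CRITICAL and 3 is UNKNOWN. As a rough sketch for testing a plugin by hand before wiring it into commands.cfg, the hypothetical helpers below (nagios_state_name and run_check are names made up for this example) run any check command and print the state Nagios would show.

```shell
#!/bin/sh
# Map a plugin exit status to the corresponding Nagios state name.
nagios_state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        3) echo "UNKNOWN" ;;
        *) echo "INVALID" ;;   # outside the 0-3 range used by plugins
    esac
}

# Run a check command, print its output prefixed with the state,
# and return the same exit status the plugin returned.
run_check() {
    status=0
    output=$("$@") || status=$?
    echo "[$(nagios_state_name "$status")] $output"
    return "$status"
}

# Example (the address is only an illustration):
# run_check /opt/nagios/libexec/check_storage_disk_nrpe 192.168.1.1 check_storage_disk
```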

The NRPE plugin can be downloaded from nagios.org.
Then register the service in service.cfg:

define service{
        use                             local-service
        hostgroup_name                  debian_servers
        service_description             hardware_disk_check
        check_command                   check_storage_disk_nrpe
        }

Then create the host and host group definitions:

define hostgroup{
        hostgroup_name  debian_servers
        alias           servers
        members         test
        }
define host{
        use                     linux-server
        host_name              test
        alias                   01
        address                 192.168.1.1
        }
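
When many servers have to be added, typing each host block by hand gets tedious. Here is a small sketch of a generator, assuming the linux-server template used above; gen_host is a hypothetical helper, and the host name and address in the usage comment are made-up values.

```shell
#!/bin/sh
# Print a Nagios host definition based on the linux-server template.
gen_host() {
    # $1: host_name, $2: alias, $3: address
    cat <<EOF
define host{
        use                     linux-server
        host_name               $1
        alias                   $2
        address                 $3
        }
EOF
}

# Example (hypothetical host):
# gen_host web01 02 192.168.1.2 >> /opt/nagios/etc/objects/host_debian.cfg
```

A new host still has to be listed in the members line of its hostgroup before the group's services apply to it.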

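The Nagios web login is authenticated through Apache htpasswd, which keeps things simple: to change a password, update the htpasswd file that the CGIs are protected with. To change the login user name, you need to update both Apache's htpasswd file and the user authorization entries in cgi.cfg.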
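
As an illustration of the cgi.cfg half of that change, the sketch below rewrites a user name on every authorized_for_* line. rename_cgi_user is a hypothetical helper, and the user names nagiosadmin and monitor as well as the htpasswd.users path in the comments are assumptions; check them against your own install.

```shell
#!/bin/sh
# Replace one user name with another on every authorized_for_* line of cgi.cfg.
# The rewritten file goes to stdout so it can be reviewed before replacing the original.
rename_cgi_user() {
    # $1: old user, $2: new user, $3: path to cgi.cfg
    awk -v old="$1" -v new="$2" '
        /^authorized_for_/ {
            split($0, kv, "=")
            n = split(kv[2], users, ",")
            line = kv[1] "="
            for (i = 1; i <= n; i++) {
                if (users[i] == old) users[i] = new
                line = line users[i] (i < n ? "," : "")
            }
            print line
            next
        }
        { print }
    ' "$3"
}

# Example (assumed user names and htpasswd path):
# htpasswd /opt/nagios/etc/htpasswd.users monitor
# rename_cgi_user nagiosadmin monitor /opt/nagios/etc/cgi.cfg > /tmp/cgi.cfg.new
```

Only exact matches in the comma-separated user lists are replaced, so other users on the same line are left alone.
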
Then check the Nagios configuration:

/opt/nagios/bin/nagios -v /opt/nagios/etc/nagios.cfg 

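Then start Nagios.
A source build of Nagios does not put a service script under init on its own (that is what make install-init is for). If you need one, here is a copy: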

#!/bin/sh
# 
# chkconfig: 345 99 01
# description: Nagios network monitor
#
# File : nagios
#
# Author : Jorge Sanchez Aymar (jsanchez@lanchile.cl)
# 
# Changelog :
#
# 1999-07-09 Karl DeBisschop <kdebisschop@infoplease.com>
#  - setup for autoconf
#  - add reload function
# 1999-08-06 Ethan Galstad <egalstad@nagios.org>
#  - Added configuration info for use with RedHat's chkconfig tool
#    per Fran Boon's suggestion
# 1999-08-13 Jim Popovitch <jimpop@rocketship.com>
#  - added variable for nagios/var directory
#  - cd into nagios/var directory before creating tmp files on startup
# 1999-08-16 Ethan Galstad <egalstad@nagios.org>
#  - Added test for rc.d directory as suggested by Karl DeBisschop
# 2000-07-23 Karl DeBisschop <kdebisschop@users.sourceforge.net>
#  - Clean out redhat macros and other dependencies
# 2003-01-11 Ethan Galstad <egalstad@nagios.org>
#  - Updated su syntax (Gary Miller)
#
# Description: Starts and stops the Nagios monitor
#              used to provide network services status.
#
  
status_nagios ()
{

    if test -x $NagiosCGIDir/daemonchk.cgi; then
        if $NagiosCGIDir/daemonchk.cgi -l $NagiosRunFile; then
                return 0
        else
            return 1
        fi
    else
        if ps -p $NagiosPID > /dev/null 2>&1; then
                return 0
        else
            return 1
        fi
    fi

    return 1
}


printstatus_nagios()
{

    if status_nagios $1 $2; then
        echo "nagios (pid $NagiosPID) is running..."
    else
        echo "nagios is not running"
    fi
}


killproc_nagios ()
{

    kill $2 $NagiosPID

}


pid_nagios ()
{

    if test ! -f $NagiosRunFile; then
        echo "No lock file found in $NagiosRunFile"
        exit 1
    fi

    NagiosPID=`head -n 1 $NagiosRunFile`
}


# Source function library
# Solaris doesn't have an rc.d directory, so do a test first
if [ -f /etc/rc.d/init.d/functions ]; then
    . /etc/rc.d/init.d/functions
elif [ -f /etc/init.d/functions ]; then
    . /etc/init.d/functions
fi

prefix=/opt/nagios
exec_prefix=${prefix}
NagiosBin=${exec_prefix}/bin/nagios
NagiosCfgFile=${prefix}/etc/nagios.cfg
NagiosStatusFile=${prefix}/var/status.dat
NagiosRetentionFile=${prefix}/var/retention.dat
NagiosCommandFile=${prefix}/var/rw/nagios.cmd
NagiosVarDir=${prefix}/var
NagiosRunFile=${prefix}/var/nagios.lock
NagiosLockDir=/var/lock/subsys
NagiosLockFile=nagios
NagiosCGIDir=${exec_prefix}/sbin
NagiosUser=nagios
NagiosGroup=nagios
          

# Check that nagios exists.
if [ ! -f $NagiosBin ]; then
    echo "Executable file $NagiosBin not found.  Exiting."
    exit 1
fi

# Check that nagios.cfg exists.
if [ ! -f $NagiosCfgFile ]; then
    echo "Configuration file $NagiosCfgFile not found.  Exiting."
    exit 1
fi
          
# See how we were called.
case "$1" in

    start)
        echo -n "Starting nagios:"
        $NagiosBin -v $NagiosCfgFile > /dev/null 2>&1;
        if [ $? -eq 0 ]; then
            su - $NagiosUser -c "touch $NagiosVarDir/nagios.log $NagiosRetentionFile"
            rm -f $NagiosCommandFile
            touch $NagiosRunFile
            chown $NagiosUser:$NagiosGroup $NagiosRunFile
            $NagiosBin -d $NagiosCfgFile
            if [ -d $NagiosLockDir ]; then touch $NagiosLockDir/$NagiosLockFile; fi
            echo " done."
            exit 0
        else
            echo "CONFIG ERROR!  Start aborted.  Check your Nagios configuration."
            exit 1
        fi
        ;;

    stop)
        echo -n "Stopping nagios: "

        pid_nagios
        killproc_nagios nagios

        # now we have to wait for nagios to exit and remove its
        # own NagiosRunFile, otherwise a following "start" could
        # happen, and then the exiting nagios will remove the
        # new NagiosRunFile, allowing multiple nagios daemons
        # to (sooner or later) run - John Sellens
        #echo -n 'Waiting for nagios to exit .'
        for i in 1 2 3 4 5 6 7 8 9 10 ; do
            if status_nagios > /dev/null; then
            echo -n '.'
            sleep 1
            else
            break
            fi
        done
        if status_nagios > /dev/null; then
            echo ''
            echo 'Warning - nagios did not exit in a timely manner'
        else
            echo 'done.'
        fi

        rm -f $NagiosStatusFile $NagiosRunFile $NagiosLockDir/$NagiosLockFile $NagiosCommandFile
        ;;

    status)
        pid_nagios
        printstatus_nagios nagios
        ;;

    checkconfig)
        printf "Running configuration check..."
        $NagiosBin -v $NagiosCfgFile > /dev/null 2>&1;
        if [ $? -eq 0 ]; then
            echo " OK."
        else
            echo " CONFIG ERROR!  Check your Nagios configuration."
            exit 1
        fi
        ;;

    restart)
        printf "Running configuration check..."
        $NagiosBin -v $NagiosCfgFile > /dev/null 2>&1;
        if [ $? -eq 0 ]; then
            echo "done."
            $0 stop
            $0 start
        else
            echo " CONFIG ERROR!  Restart aborted.  Check your Nagios configuration."
            exit 1
        fi
        ;;

    reload|force-reload)
        printf "Running configuration check..."
        $NagiosBin -v $NagiosCfgFile > /dev/null 2>&1;
        if [ $? -eq 0 ]; then
            echo "done."
            if test ! -f $NagiosRunFile; then
                $0 start
            else
                pid_nagios
                if status_nagios > /dev/null; then
                    printf "Reloading nagios configuration..."
                    killproc_nagios nagios -HUP
                    echo "done"
                else
                    $0 stop
                    $0 start
                fi
            fi
        else
            echo " CONFIG ERROR!  Reload aborted.  Check your Nagios configuration."
            exit 1
        fi
        ;;

    *)
        echo "Usage: nagios {start|stop|restart|reload|force-reload|status|checkconfig}"
        exit 1
        ;;

esac
  
# End of this script

Then log in to the web interface and check that everything works.

Cacti:

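Cacti handles the graphing. Nagios can produce graphs through pnp4nagios, but the experience is not great, whereas Cacti is quite good for customized monitoring graphs, even though both rely on rrdtool underneath.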

  • Preparation:
apt-get install rrdtool  php5 mysql-server

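PHP 5 actually needs more packages than these; more on that later.
After downloading Cacti, unpack it and enter the directory, then log in to MySQL and import the Cacti tables: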

mysql> create database cacti;
Query OK, 1 row affected (0.00 sec)
mysql> use cacti;
mysql> source cacti.sql;
mysql> GRANT ALL PRIVILEGES ON cacti.* TO 'cacti'@'127.0.0.1' IDENTIFIED BY 'cacti';
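
If you would rather not type these statements interactively, they can be generated and piped into the mysql client. The following is a sketch under the same assumptions as the session above (a MySQL 5.x server, where GRANT ... IDENTIFIED BY is still accepted; MySQL 8.0 removed that form). cacti_db_sql is a hypothetical helper name.

```shell
#!/bin/sh
# Print the SQL that creates the Cacti database and grants access to its user.
cacti_db_sql() {
    # $1: database, $2: user, $3: password, $4: client host
    cat <<EOF
CREATE DATABASE $1;
GRANT ALL PRIVILEGES ON $1.* TO '$2'@'$4' IDENTIFIED BY '$3';
EOF
}

# Example, matching the values used in this post:
# cacti_db_sql cacti cacti cacti 127.0.0.1 | mysql -u root -p
# mysql -u root -p cacti < cacti.sql
```

The values are inserted into the SQL as-is, so this is only suitable for trusted input such as your own fixed names.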

Edit the configuration file:

vi include/config.php
$database_type     = 'mysql';
$database_default  = 'cacti';
$database_hostname = '127.0.0.1';
$database_username = 'cacti';
$database_password = 'cacti';
$database_port     = '3306';
$database_ssl      = false;

After that, browse to ip/cacti and the installation wizard appears.
The default user is admin and the default password is admin.


Paste_Image.png

The wizard tells you which packages are missing; just install them:

Paste_Image.png

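Newer versions of Cacti have an issue with MySQL time zone permissions: the cacti user needs read access to the time zone table, which is the error shown in the screenshot above. Fix it as follows: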

mysql> GRANT SELECT ON mysql.time_zone_name TO cacti@'127.0.0.1';
mysql_tzinfo_to_sql /usr/share/zoneinfo/ | mysql -u root -p mysql

After that, click Next and the installation completes.

Paste_Image.png

From here on, configure SNMP for monitoring and graphing.

Useful links:
http://exchange.nagios.org
http://forums.cacti.net
