3. Getting Started

Some people freak out at this part, but this is where the fun starts. It's where you get to configure the actual server and hosts (among other things).

Tip

If you're really against editing the files manually (or have a very large network), there is a Perl script called nmap2nagios.pl that can be downloaded from SourceForge. It depends on the XML::Simple Perl module, which you may have to force install (I had to on FreeBSD 5.4, 6.0 and OS X).

monitor# cd /usr/local/etc/nagios/
monitor# ls -al
total 104
drwxr-xr-x  2 root  wheel    512 Nov  7 17:56 .
drwxr-xr-x  6 root  wheel    512 Nov  8 12:17 ..
-rw-r--r--  1 root  wheel  30046 Nov  7 17:56 bigger.cfg-sample
-rw-r--r--  1 root  wheel   9569 Nov  7 17:56 cgi.cfg-sample
-rw-r--r--  1 root  wheel   4475 Nov  7 17:56 checkcommands.cfg-sample
-rw-r--r--  1 root  wheel  13602 Nov  7 17:56 minimal.cfg-sample
-rw-r--r--  1 root  wheel   4297 Nov  7 17:56 misccommands.cfg-sample
-rw-r--r--  1 root  wheel  30735 Nov  7 17:56 nagios.cfg-sample
-rw-r--r--  1 root  wheel   1335 Nov  7 17:56 resource.cfg-sample

You will notice that all the config files are there, ready to use; all you have to do is rename them from .cfg-sample to .cfg. The structure and use of the config files is covered thoroughly in the Nagios documentation. To keep things simple, we'll use a basic configuration based on minimal.cfg.

Back up the sample files into a separate directory to reduce clutter, then rename the samples to make them the live config files.

monitor# mkdir samples
monitor# cp *.cfg-sample samples/
monitor# ls
bigger.cfg-sample               checkcommands.cfg-sample        misccommands.cfg-sample         resource.cfg-sample
cgi.cfg-sample                  minimal.cfg-sample              nagios.cfg-sample               samples
monitor# mv bigger.cfg-sample bigger.cfg 
monitor# mv cgi.cfg-sample cgi.cfg
monitor# mv checkcommands.cfg-sample checkcommands.cfg
monitor# mv minimal.cfg-sample minimal.cfg
monitor# mv misccommands.cfg-sample misccommands.cfg
monitor# mv nagios.cfg-sample nagios.cfg
monitor# mv resource.cfg-sample resource.cfg
monitor# ls -al
total 76
drwxr-xr-x  3 root  wheel    512 Nov  9 16:24 .
drwxr-xr-x  6 root  wheel    512 Nov  8 12:17 ..
-rw-r--r--  1 root  wheel  30046 Nov  7 17:56 bigger.cfg
-rw-r--r--  1 root  wheel   9569 Nov  7 17:56 cgi.cfg
-rw-r--r--  1 root  wheel   4475 Nov  7 17:56 checkcommands.cfg
-rw-r--r--  1 root  wheel  13602 Nov  7 17:56 minimal.cfg
-rw-r--r--  1 root  wheel   4297 Nov  7 17:56 misccommands.cfg
-rw-r--r--  1 root  wheel  30735 Nov  7 17:56 nagios.cfg
-rw-r--r--  1 root  wheel   1335 Nov  7 17:56 resource.cfg
drwxr-xr-x  2 root  wheel    512 Nov  9 16:22 samples
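
If typing seven mv commands feels tedious, the backup-and-rename step can be scripted. The following is only a sketch: activate_samples is a made-up helper name, not something shipped with Nagios, and it is written for plain sh (root's default shell on FreeBSD is csh, so save it to a file and run it with sh).

```shell
#!/bin/sh
# Hypothetical helper (not part of Nagios): copy every *.cfg-sample file in
# a directory into a samples/ subdirectory, then rename it to *.cfg.
activate_samples() {
    dir=$1
    mkdir -p "$dir/samples" || return 1
    for f in "$dir"/*.cfg-sample; do
        [ -e "$f" ] || continue              # no samples left: nothing to do
        cp "$f" "$dir/samples/" || return 1  # keep a pristine copy
        mv "$f" "${f%-sample}" || return 1   # foo.cfg-sample -> foo.cfg
    done
}

# Example (run as root):
#   activate_samples /usr/local/etc/nagios
```

Running it a second time is harmless, because by then no .cfg-sample files are left to rename.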

Now that we have all the files in place, we're almost there. All we have to do is edit a few files to configure our system and tell Nagios what we want to monitor. There are only three files you need to edit to get a minimal system up and running.

Tip

If you're using FreeBSD, all of the Nagios config files are located in: /usr/local/etc/nagios

nagios.cfg

Open nagios.cfg and make the following changes:

Comment out the following lines.

#cfg_file=/usr/local/etc/nagios/checkcommands.cfg
#cfg_file=/usr/local/etc/nagios/misccommands.cfg
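
These two files are presumably disabled because minimal.cfg already defines every command it uses. If you'd rather not make the edit by hand, here is a rough sh sketch. comment_out_cfg_file is a hypothetical helper name, and it assumes the lines look exactly like the ones above with the # removed.

```shell
#!/bin/sh
# Hypothetical helper: comment out the cfg_file= line in nagios.cfg whose
# path ends in the given file name. Lines already commented are untouched.
comment_out_cfg_file() {
    cfg=$1
    name=$2
    tmp="$cfg.tmp.$$"
    # Dots in the name act as regex wildcards, which is harmless here.
    sed "s|^cfg_file=\(.*/$name\)\$|#cfg_file=\1|" "$cfg" > "$tmp" &&
        cat "$tmp" > "$cfg"      # write back in place, keeping owner and mode
    status=$?
    rm -f "$tmp"
    return $status
}

# Example (run as root):
#   comment_out_cfg_file /usr/local/etc/nagios/nagios.cfg checkcommands.cfg
#   comment_out_cfg_file /usr/local/etc/nagios/nagios.cfg misccommands.cfg
```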

minimal.cfg

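Open minimal.cfg and make the following changes. The whole edited file is shown; the values specific to this setup are the contact definition, the second host (intranet), the hostgroup members, and the host_name list of the PING service.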

###############################################################################
# MINIMAL.CFG
#
# MINIMALISTIC OBJECT CONFIG FILE (Template-Based Object File Format)
#
# Last Modified: 03-23-2005
#
#
# NOTE: This config file is intended to be used to test a Nagios installation
#       that has been compiled with support for the template-based object
#       configuration files.
#
#       This config file is intended to serve as an *extremely* simple 
#       example of how you can create your object configuration file(s).
#       If you're interested in more complex object configuration files for
#       Nagios, look in the sample-config/template-object/ subdirectory of
#       the distribution.
#
###############################################################################



###############################################################################
###############################################################################
#
# TIME PERIODS
#
###############################################################################
###############################################################################

# This defines a timeperiod where all times are valid for checks, 
# notifications, etc.  The classic "24x7" support nightmare. :-)

define timeperiod{
        timeperiod_name 24x7
        alias           24 Hours A Day, 7 Days A Week
        sunday          00:00-24:00
        monday          00:00-24:00
        tuesday         00:00-24:00
        wednesday       00:00-24:00
        thursday        00:00-24:00
        friday          00:00-24:00
        saturday        00:00-24:00
        }



###############################################################################
###############################################################################
#
# COMMANDS
#
###############################################################################
###############################################################################

# This is a sample service notification command that can be used to send email 
# notifications (about service alerts) to contacts.

define command{
	command_name	notify-by-email
	command_line	/usr/bin/printf "%b" "***** Nagios @VERSION@ *****\n\nNotification Type: $NOTIFICATIONTYPE$\n\nService: $SERVICEDESC$\nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\nState: $SERVICESTATE$\n\nDate/Time: $LONGDATETIME$\n\nAdditional Info:\n\n$OUTPUT$" | @MAIL_PROG@ -s "** $NOTIFICATIONTYPE$ alert - $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$ **" $CONTACTEMAIL$
	}


# This is a sample host notification command that can be used to send email 
# notifications (about host alerts) to contacts.

define command{
	command_name	host-notify-by-email
	command_line	/usr/bin/printf "%b" "***** Nagios @VERSION@ *****\n\nNotification Type: $NOTIFICATIONTYPE$\nHost: $HOSTNAME$\nState: $HOSTSTATE$\nAddress: $HOSTADDRESS$\nInfo: $OUTPUT$\n\nDate/Time: $LONGDATETIME$\n" | @MAIL_PROG@ -s "Host $HOSTSTATE$ alert for $HOSTNAME$!" $CONTACTEMAIL$
	}


# Command to check to see if a host is "alive" (up) by pinging it

define command{
        command_name    check-host-alive
        command_line    $USER1$/check_ping -H $HOSTADDRESS$ -w 99,99% -c 100,100% -p 1 
        }


# Generic command to check a device by pinging it

define command{
	command_name	check_ping
	command_line	$USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5
	}


# Command used to check disk space usage on local partitions

define command{
	command_name	check_local_disk
	command_line	$USER1$/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$
	}


# Command used to check the number of currently logged in users on the
# local machine

define command{
	command_name	check_local_users
	command_line	$USER1$/check_users -w $ARG1$ -c $ARG2$
	}


# Command to check the number of running processes on the local machine

define command{
	command_name	check_local_procs
	command_line	$USER1$/check_procs -w $ARG1$ -c $ARG2$
	}


# Command to check the load on the local machine

define command{
	command_name	check_local_load
	command_line	$USER1$/check_load -w $ARG1$ -c $ARG2$
	}



###############################################################################
###############################################################################
#
# CONTACTS
#
###############################################################################
###############################################################################

# In this simple config file, a single contact will receive all alerts.
# This assumes that you have an account (or email alias) called
# "@nagios_user@-admin" on the local host.

define contact{
        contact_name                    chrisb
        alias                           Chris Burgess 
        service_notification_period     24x7
        host_notification_period        24x7
        service_notification_options    w,u,c,r
        host_notification_options       d,r
        service_notification_commands   notify-by-email
        host_notification_commands      host-notify-by-email
        email                           chris.burgess@nagiosbook.org
        }



###############################################################################
###############################################################################
#
# CONTACT GROUPS
#
###############################################################################
###############################################################################

# We only have one contact in this simple configuration file, so there is
# no need to create more than one contact group.

define contactgroup{
        contactgroup_name       admins
        alias                   Nagios Administrators
        members                 chrisb
        }



###############################################################################
###############################################################################
#
# HOSTS
#
###############################################################################
###############################################################################

# Generic host definition template - This is NOT a real host, just a template!

define host{
        name                            generic-host    ; The name of this host template
        notifications_enabled           1       ; Host notifications are enabled
        event_handler_enabled           1       ; Host event handler is enabled
        flap_detection_enabled          1       ; Flap detection is enabled
        failure_prediction_enabled      1       ; Failure prediction is enabled
        process_perf_data               1       ; Process performance data
        retain_status_information       1       ; Retain status information across program restarts
        retain_nonstatus_information    1       ; Retain non-status information across program restarts
        register                        0       ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
        }


# Since this is a simple configuration file, we only monitor two hosts - the
# local host (this machine) and the intranet server.

define host{
        use                     generic-host            ; Name of host template to use
        host_name               localhost
        alias                   localhost
        address                 127.0.0.1
        check_command           check-host-alive
        max_check_attempts      10
        notification_interval   120
        notification_period     24x7
        notification_options    d,r
        contact_groups  admins
        }

define host{
        use                     generic-host            ; Name of host template to use
        host_name               intranet       
        alias                   Intranet Server
        address                 192.168.1.99
        check_command           check-host-alive
        max_check_attempts      10
        notification_interval   120
        notification_period     24x7
        notification_options    d,r
        contact_groups  admins
        }



###############################################################################
###############################################################################
#
# HOST GROUPS
#
###############################################################################
###############################################################################

# We only have two hosts in our simple config file, so there is no need to
# create more than one hostgroup.

define hostgroup{
        hostgroup_name  test
        alias           Test Servers
        members         localhost,intranet
        }



###############################################################################
###############################################################################
#
# SERVICES
#
###############################################################################
###############################################################################

# Generic service definition template - This is NOT a real service, just a template!

define service{
        name                            generic-service ; The 'name' of this service template
        active_checks_enabled           1       ; Active service checks are enabled
        passive_checks_enabled          1       ; Passive service checks are enabled/accepted
        parallelize_check               1       ; Active service checks should be parallelized (disabling this can lead to major performance problems)
        obsess_over_service             1       ; We should obsess over this service (if necessary)
        check_freshness                 0       ; Default is to NOT check service 'freshness'
        notifications_enabled           1       ; Service notifications are enabled
        event_handler_enabled           1       ; Service event handler is enabled
        flap_detection_enabled          1       ; Flap detection is enabled
        failure_prediction_enabled      1       ; Failure prediction is enabled
        process_perf_data               1       ; Process performance data
        retain_status_information       1       ; Retain status information across program restarts
        retain_nonstatus_information    1       ; Retain non-status information across program restarts
        register                        0       ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
        }


# Define a service to "ping" the local machine

define service{
        use                             generic-service         ; Name of service template to use
        host_name                       localhost, intranet
        service_description             PING
        is_volatile                     0
        check_period                    24x7
        max_check_attempts              4
        normal_check_interval           5
        retry_check_interval            1
        contact_groups                  admins
        notification_interval           960
        notification_period             24x7
	check_command			check_ping!100.0,20%!500.0,60%
        }


# Define a service to check the disk space of the root partition
# on the local machine.  Warning if < 20% free, critical if
# < 10% free space on partition.

define service{
        use                             generic-service         ; Name of service template to use
        host_name                       localhost
        service_description             Root Partition
        is_volatile                     0
        check_period                    24x7
        max_check_attempts              4
        normal_check_interval           5
        retry_check_interval            1
        contact_groups                  admins
        notification_interval           960
        notification_period             24x7
	check_command			check_local_disk!20%!10%!/
        }



# Define a service to check the number of currently logged in
# users on the local machine.  Warning if > 20 users, critical
# if > 50 users.

define service{
        use                             generic-service         ; Name of service template to use
        host_name                       localhost
        service_description             Current Users
        is_volatile                     0
        check_period                    24x7
        max_check_attempts              4
        normal_check_interval           5
        retry_check_interval            1
        contact_groups                  admins
        notification_interval           960
        notification_period             24x7
	check_command			check_local_users!20!50
        }


# Define a service to check the number of currently running procs
# on the local machine.  Warning if > 250 processes, critical if
# > 400 processes.

define service{
        use                             generic-service         ; Name of service template to use
        host_name                       localhost
        service_description             Total Processes
        is_volatile                     0
        check_period                    24x7
        max_check_attempts              4
        normal_check_interval           5
        retry_check_interval            1
        contact_groups                  admins
        notification_interval           960
        notification_period             24x7
	check_command			check_local_procs!250!400
        }



# Define a service to check the load on the local machine. 

define service{
        use                             generic-service         ; Name of service template to use
        host_name                       localhost
        service_description             Current Load
        is_volatile                     0
        check_period                    24x7
        max_check_attempts              4
        normal_check_interval           5
        retry_check_interval            1
        contact_groups                  admins
        notification_interval           960
        notification_period             24x7
	check_command			check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }



# EOF

cgi.cfg (optional)

Turn off authentication for the CGIs when getting started. Once you can view the web interface and everything is working, turn authentication back on. This section is totally optional, but it does eliminate authentication problems (which are common for beginners) during the configuration stage.

Open cgi.cfg and make the following changes:

use_authentication=1

change to

use_authentication=0

Warning

It is a very bad idea to permanently disable authentication. The documentation covers the topic in an incredibly thorough fashion. It is, however, worth knowing that this option exists when you're initially configuring Nagios.
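
Since it's easy to forget to switch authentication back on, a small check like the one below can serve as a reminder. It's a sketch only: auth_enabled is a made-up name, and it assumes the setting sits on a line of its own, as in the sample cgi.cfg.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the given cgi.cfg contains an active
# (uncommented) use_authentication=1 line.
auth_enabled() {
    grep -q '^use_authentication=1[[:space:]]*$' "$1"
}

# Example:
#   auth_enabled /usr/local/etc/nagios/cgi.cfg ||
#       echo "WARNING: CGI authentication is switched off"
```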

resource.cfg

Open resource.cfg and check that $USER1$ points to the directory that holds the Nagios plugins:

$USER1$=/usr/local/libexec/nagios
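
Every check command in minimal.cfg is run as $USER1$/check_something, so a wrong path here breaks all of them. As a quick sanity check, something along these lines should do. The helper names are invented, and the plugin list is simply the five plugins that the commands in minimal.cfg call.

```shell
#!/bin/sh
# Hypothetical helper: print the value assigned to $USER1$ in resource.cfg.
user1_path() {
    sed -n 's/^\$USER1\$=//p' "$1" | tail -n 1
}

# Hypothetical helper: verify that the plugins called from minimal.cfg exist
# in the $USER1$ directory and are executable.
check_plugins() {
    dir=$(user1_path "$1")
    [ -n "$dir" ] || { echo "no \$USER1\$ line in $1" >&2; return 1; }
    missing=0
    for p in check_ping check_disk check_users check_procs check_load; do
        [ -x "$dir/$p" ] || {
            echo "missing or not executable: $dir/$p" >&2
            missing=1
        }
    done
    return $missing
}

# Example:
#   check_plugins /usr/local/etc/nagios/resource.cfg && echo "plugins found"
```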

You should now be able to run the pre-flight check successfully. The output should look something like the following.

monitor# /usr/local/bin/nagios -v /usr/local/etc/nagios/nagios.cfg

Nagios 2.0b3
Copyright (c) 1999-2005 Ethan Galstad (www.nagios.org)
Last Modified: 04-03-2005
License: GPL

Reading configuration data...

Running pre-flight check on configuration data...

Checking services...
        Checked 6 services.
Checking hosts...
        Checked 2 hosts.
Checking host groups...
        Checked 1 host groups.
Checking service groups...
        Checked 0 service groups.
Checking contacts...
        Checked 1 contacts.
Checking contact groups...
        Checked 1 contact groups.
Checking service escalations...
        Checked 0 service escalations.
Checking service dependencies...
        Checked 0 service dependencies.
Checking host escalations...
        Checked 0 host escalations.
Checking host dependencies...
        Checked 0 host dependencies.
Checking commands...
        Checked 8 commands.
Checking time periods...
        Checked 1 time periods.
Checking extended host info definitions...
        Checked 0 extended host info definitions.
Checking extended service info definitions...
        Checked 0 extended service info definitions.
Checking for circular paths between hosts...
Checking for circular host and service dependencies...
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...

Total Warnings: 0
Total Errors:   0

Things look okay - No serious problems were detected during the pre-flight check
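
If you want to script this step, for example to refuse to start Nagios when the check complains, one option is to read the two summary lines from the output above. This is a sketch: preflight_clean is a hypothetical name, and it relies only on the "Total Warnings:" and "Total Errors:" lines shown in the sample output.

```shell
#!/bin/sh
# Hypothetical helper: read "nagios -v" output on stdin and succeed only if
# the summary reports zero warnings and zero errors.
preflight_clean() {
    awk '
        /^Total Warnings:/ { w = $3; seen++ }
        /^Total Errors:/   { e = $3; seen++ }
        END {
            if (seen < 2) exit 2            # summary lines not found
            printf "warnings=%d errors=%d\n", w, e
            exit (w + e > 0 ? 1 : 0)
        }'
}

# Example:
#   /usr/local/bin/nagios -v /usr/local/etc/nagios/nagios.cfg | preflight_clean &&
#       echo "config looks clean"
```

Treating warnings as failures is a deliberately strict choice; loosen the last exit line if warnings are acceptable to you.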

Now, let's start Nagios and see what happens!

monitor# /usr/local/bin/nagios /usr/local/etc/nagios/nagios.cfg &
[1] 13274
monitor# 
Nagios 2.0b3
Copyright (c) 1999-2005 Ethan Galstad (www.nagios.org)
Last Modified: 04-03-2005
License: GPL

Nagios 2.0b3 starting... (PID=13274)

Let's run ps and grep to make sure it's running.

monitor# ps -aux | grep nagios
nagios 13274  0.0  0.2  3348  2456  p0  S     4:54PM   0:00.05 /usr/local/bin/nagios /usr/local/etc/nagios/nagios.cfg
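
Depending on timing, the grep process itself can show up in that listing. A common workaround is to wrap the first letter of the pattern in brackets, so the pattern no longer matches grep's own command line. A minimal sketch (filter_nagios is just an illustrative name):

```shell
#!/bin/sh
# Hypothetical helper: filter ps output on stdin down to Nagios processes.
# The regex [n]agios matches the text "nagios" but not the literal string
# "[n]agios" that appears in the command line of this grep.
filter_nagios() {
    grep '[n]agios'
}

# Example:
#   ps aux | filter_nagios
```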

Tail the Nagios log to check for errors.

monitor# tail -f /var/spool/nagios/nagios.log 
[1131555251] Nagios 2.0b3 starting... (PID=13274)
[1131555251] LOG VERSION: 2.0
[1131555261] HOST ALERT: intranet;UP;HARD;1;PING OK - Packet loss = 0%, RTA = 0.24 ms
[1131555261] HOST NOTIFICATION: chrisb;intranet;UP;host-notify-by-email;PING OK - Packet loss = 0%, RTA = 0.24 ms
[1131555261] SERVICE ALERT: intranet;PING;OK;SOFT;1;PING OK - Packet loss = 0%, RTA = 0.25 ms

All looks good! Now let's visit the URL for the web interface. Fire up Firefox and go to the IP address of your Nagios server.

Someone pointed out that the following isn't necessary, although it was for me. Check the permissions carefully.

chown and chgrp the /var/spool/nagios dir and files

and the plugins...

monitor# chmod 755 *

monitor# chgrp wheel *

monitor# chown root *

More on this soon...