Nagios
IT infrastructure monitoring tool
ILG
Insight GNU/Linux Group
Reinventing the way you Think, Learn, Work
NAGIOS
MODULE - 2
www.gnugroup.org
Index of module - 2
External Commands
NagiosQL
Event Handlers
Security Considerations
Volatile Services
Freshness Checks
State Stalking
Flapping
Using Templates
Object Inheritance
Clustering - Distributed Monitoring
Redundant and Failover Network Monitoring
External Commands
Enabling External Commands
In order to have Nagios process external commands, make sure you do the following:
Enable external command checking with the check_external_commands option.
Specify the location of the command file with the command_file option.
Set up proper permissions on the directory containing the external command file, as described in the quickstart guide.
Nagios checks for external commands:
At regular intervals specified by the command_check_interval option in the main configuration file.
Immediately after event handlers are executed. This is in addition to the regular cycle of external command checks
and is done to provide immediate action if an event handler submits commands to Nagios.
External commands can be used to accomplish a variety of things while Nagios is running.
Examples of what can be done include temporarily disabling notifications for services and hosts, temporarily
disabling service checks, forcing immediate service checks, adding comments to hosts and services, etc.
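As a rough illustration of how a script might submit such commands, the sketch below formats a line in the external command syntax ([timestamp] COMMAND;arguments) and writes it to the command file. The helper names, the host and service names, and the timestamp are assumptions made for the example; the command file path is whatever your command_file option points to.

```python
import time


def format_external_command(name, *args, timestamp=None):
    """Build one external command line: [timestamp] NAME;arg1;arg2"""
    if timestamp is None:
        timestamp = int(time.time())
    fields = ";".join(str(field) for field in (name, *args))
    return f"[{timestamp}] {fields}\n"


def submit_external_command(command_file, line):
    """Write one formatted command line to the command file.

    In a real setup command_file is the named pipe set by the
    command_file option; here it is simply a path passed by the caller.
    """
    with open(command_file, "w") as handle:
        handle.write(line)


# Hypothetical host/service names, fixed timestamp for a reproducible result.
disable = format_external_command(
    "DISABLE_SVC_NOTIFICATIONS", "localhost", "daemons", timestamp=1700000000
)
force = format_external_command(
    "SCHEDULE_FORCED_SVC_CHECK", "localhost", "daemons", 1700000000,
    timestamp=1700000000,
)
print(disable, end="")
print(force, end="")
```

Keeping the formatting separate from the write makes it easy to check the generated line before anything is sent to a running Nagios instance.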
Event Handlers
Event handlers are optional system commands (scripts or executables) that are run whenever a
host or service state change occurs.
An obvious use for event handlers is the ability for Nagios to proactively fix problems before anyone
is notified. Some other uses for event handlers include:
Etc
Example of Event Handlers
Host file directive
define service{
    use                  local-service
    host_name            localhost
    service_description  daemons
    check_command        check_nrpe!check_daemons
    event_handler        restart-services
}
In your commands.cfg file, make sure you have event_handler defined something like:
define command{
    command_name    restart-services
    command_line    /usr/local/nagios/libexec/eventhandlers/restart-services \

ALL=(ALL) NOPASSWD: NAGIOSCOMMANDS
Volatile Services
Nagios has the ability to distinguish between "normal" services and "volatile"
services.
The is_volatile option in each service definition allows you to specify whether a
specific service is volatile or not.
For most people, the majority of all monitored services will be non-volatile (i.e.
"normal"). However, volatile services can be very useful when used properly...
What Are They Useful For?
Volatile services are useful for monitoring...
Things that automatically reset themselves to an "OK" state each time they are checked
Events such as security alerts which require attention every time there is a problem (and not just the first time)
Freshness Check
Nagios supports a feature that does "freshness" checking on the
results of host and service checks.
The purpose of freshness checking is to ensure that host and
service checks are being provided passively by external
applications on a regular basis.
Freshness checking is useful when you want to ensure that passive
checks are being received as frequently as you want.
This can be very useful in distributed and failover monitoring
environments.
How Does Freshness Checking Work?
Enabling Freshness Checking
Enable freshness checking on a host- and service-specific basis by setting the check_freshness option
in your host and service definitions to a value of 1.
Configure freshness thresholds by setting the freshness_threshold option in your host and service
definitions.
Configure the check_command option in your host or service definitions to reflect a valid command
that should be used to actively check the host or service when it is detected as stale.
The check_period option in your host and service definitions is used when Nagios determines when a
host or service can be checked for freshness, so make sure it is set to a valid timeperiod.
A service that might require freshness checking is one that reports
the status of your nightly backup jobs.
Perhaps you have an external script that submits the results of the backup
job to Nagios once the backup is completed.
In this case, all of the checks/results for the service are provided by an
external application using passive checks.
In order to ensure that the status of the backup job gets reported every
day, you may want to enable freshness checking for the service.
If the external script doesn't submit the results of the backup job, you can
have Nagios fake a critical result by doing something like this.
For example, the following service definition will accept passive checks but will report an error if they are not
present:
define service
{
    use                     generic-service
    host_name               linuxbox02
    service_description     SSH
    check_command           no-passive-check-results
    check_freshness         1
    freshness_threshold     43200
    active_checks_enabled   1
    passive_checks_enabled  1
}
The freshness_threshold option specifies the number of seconds after which an active check should be
performed. In this case, it is set to 12 hours.
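The core of the staleness decision can be sketched as a simple age comparison. This is only an illustration of the idea: the function name and the sample timestamps are assumptions, and Nagios itself may apply additional rules when deciding that a result is stale.

```python
def is_stale(last_result_time, freshness_threshold, now):
    """Return True when the newest result is older than the threshold.

    All three values are in seconds (epoch time for the two timestamps).
    """
    return (now - last_result_time) > freshness_threshold


THRESHOLD = 43200  # 12 hours, matching freshness_threshold in the definition
last_result = 1_700_000_000  # hypothetical time of the last passive result

# One hour after the last result: still fresh.
print(is_stale(last_result, THRESHOLD, now=last_result + 3600))   # False
# Just over twelve hours later: stale, so the check_command would be run.
print(is_stale(last_result, THRESHOLD, now=last_result + 43201))  # True
```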
It is also necessary to define a command that will be run if no
passive check results have been provided.
The following command will use the check_dummy plugin to report an error:
define command
{
    command_name    no-passive-check-results
    command_line    $USER1$/check_dummy 2 "No passive check results"
}
State Stalking
Why is this? With state stalking enabled, Nagios would have examined the
output from each service check to see if it differed from the output of the
previous check.
If the output differed and the state of the service didn't change between the
two checks, the result of the newer service check would get logged.
The decision to enable state stalking for a particular host or service will also
depend on the plugin that you use to check that host or service.
If the plugin always returns the same text output for a particular state, there is
no reason to enable stalking for that state.
You can enable state stalking for hosts and services by using the
stalking_options directive in host and service definitions.
Volatile services are similar, but will cause notifications and event handlers
to run. Stalking is purely for logging purposes.
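The rule described above (same state, different output, state is being stalked) can be written as a small predicate. This is a sketch for illustration only: the function name and the use of state names such as "OK" or "CRITICAL" to stand in for the stalking_options values are assumptions.

```python
def should_log_stalked_result(prev_state, prev_output, new_state, new_output,
                              stalked_states):
    """Decide whether stalking would log the newer check result.

    Stalking only applies when the state stayed the same, that state is
    one of the stalked states, and the plugin output text changed.
    """
    if new_state != prev_state:
        return False  # a state change is not what stalking is about
    if new_state not in stalked_states:
        return False  # stalking is not enabled for this state
    return new_output != prev_output


stalked = {"WARNING", "CRITICAL"}  # hypothetical choice of stalked states

# Same CRITICAL state, but the output changed: the result would be logged.
print(should_log_stalked_result(
    "CRITICAL", "1 disk failed", "CRITICAL", "2 disks failed", stalked))
# Same state and identical output: nothing new to log.
print(should_log_stalked_result(
    "CRITICAL", "1 disk failed", "CRITICAL", "1 disk failed", stalked))
```

The second call shows why stalking adds nothing for a plugin that always returns the same text for a given state.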
Flapping
Using Templates
Object Inheritance
Basics
There are three variables affecting recursion
and inheritance that are present in all object
definitions.
The first variable is name. It's just a "template"
name that can be referenced in other object
definitions so they can inherit the object's
properties/variables. Template names must be
unique amongst objects of the same type.
define someobjecttype{
    object-specific variables ...
    name        template_name
    use         name_of_template_to_use
    register    [0/1]
}
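To make the effect of name and use more concrete, here is a simplified model of how a chain of templates could be merged. It is a sketch under stated assumptions, not Nagios code: objects are plain dictionaries, each object uses at most one template, locally set variables override inherited ones, and name, use and register are treated as not inherited. The template and host names are hypothetical.

```python
NOT_INHERITED = ("name", "use", "register")


def resolve(template_name, objects_by_name):
    """Return the effective variables of an object after following 'use'."""
    obj = objects_by_name[template_name]
    merged = {}
    parent = obj.get("use")
    if parent is not None:
        # Start from whatever the parent template provides.
        merged.update(resolve(parent, objects_by_name))
    # Variables set on the object itself win over inherited ones.
    merged.update({k: v for k, v in obj.items() if k not in NOT_INHERITED})
    return merged


objects = {
    "generic-host": {
        "name": "generic-host",
        "register": 0,
        "max_check_attempts": 5,
        "notification_interval": 60,
    },
    "web-host": {
        "name": "web-host",
        "use": "generic-host",
        "host_name": "web01",
        "max_check_attempts": 3,
    },
}

print(resolve("web-host", objects))
```

Here web-host keeps its own max_check_attempts and picks up notification_interval from generic-host.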
Another great feature that Nagios offers is the ability for third-party software or
other Nagios instances to report information on the status of services or hosts.
This way, Nagios does not need to schedule and run checks by itself, but other
applications can report information as it is available to them.
This means that your applications can send problem reports directly to Nagios
instead of just logging them.
Nagios also offers a tool for sending passive check results for hosts and services
over a network. It is called NSCA (Nagios Service Check Acceptor).
It can be used to send results from one Nagios instance to another.
This mechanism includes password protection, along with encryption, to
prevent injection of false results into Nagios. In this way, NSCA communication sent
over the Internet is more secure.
Checks can also originate from external applications or devices
that report information directly to Nagios.
This can be done to gather all critical errors in a single, central place. These types
of checks are called Passive Checks.
For example, when a web application cannot connect to the database, it will let
Nagios know about it immediately.
The first thing that needs to be done in order to use passive checks for your Nagios
setup is to make sure that you have the following options in your main Nagios
configuration file:
accept_passive_service_checks=1
accept_passive_host_checks=1
Logging passive check results makes it much easier to determine why a passive
check is not being processed. The following directive enables it:
log_passive_checks=1
define host
{
    use                     generic-host
    host_name               linbox1
    address                 10.1.1.45
    active_checks_enabled   0
    passive_checks_enabled  1
}
define service
{
    use                     ping-template
    host_name               linbox1
    service_description     PING
    active_checks_enabled   0
    passive_checks_enabled  1
}
In this case, Nagios will never perform any active checks on its own and will only
rely on the results that are passed to it.
We can also configure Nagios so that if no new information has been provided
within a certain period of time, it will use active checks to get the current status of
the host or service by setting the active_checks_enabled option to 1.
NSCA
Clustering
One of the first bottlenecks organizations will run into is performance when
monitoring a large number of hosts and services.
This can occur even earlier if you are using performance handlers on your service
or host checks.
One way to resolve performance problems is to cluster Nagios.
Clustering is also very useful when there are a number of remote sites that need to
be monitored by Nagios.
Usually, there are one or more Nagios instances that report information to a single
central Nagios instance.
A server that reports information to another Nagios machine is referred to as a slave.
A Nagios instance that receives reports from one or more slaves will be referred to
as a master.
Data Flow
The nsca_send_result script will then forward the service check results to the
master Nagios server. Place the following definition for nsca_send_result in your
commands configuration file (commands.cfg by default):
define command{
    command_name    nsca_send_result
    command_line    /usr/local/nagios/libexec/nsca_send_result $HOSTNAME$ $SERVICEDESC$ $SERVICESTATE$ $SERVICEOUTPUT$
}
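The contents of nsca_send_result are not shown here. As a hedged sketch of what such a script typically has to do, the code below turns the four macro values into the tab-separated line that the send_nsca client reads on standard input (host, service description, numeric return code, plugin output). The function name and sample values are assumptions; actually piping the line to send_nsca is left out.

```python
# Numeric plugin return codes for the textual states passed in $SERVICESTATE$.
STATE_TO_CODE = {"OK": 0, "WARNING": 1, "CRITICAL": 2, "UNKNOWN": 3}


def build_send_nsca_line(host, service, state, output):
    """Format one service check result for send_nsca.

    Unrecognised state names are mapped to UNKNOWN (3).
    """
    code = STATE_TO_CODE.get(state, 3)
    return f"{host}\t{service}\t{code}\t{output}\n"


# Hypothetical values standing in for the four macros in command_line.
line = build_send_nsca_line("linbox1", "PING", "CRITICAL", "Packet loss = 100%")
print(repr(line))
```

The state name is converted because the receiving side expects a return code rather than the text that $SERVICESTATE$ expands to.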
NagiosQL
Consistency checks
Syntax verification
User management
NagiosQL's installation requirements
Open a terminal.
Change to the document root with the command cd /var/www/html.
Unpack the newly downloaded tar file with the command sudo tar xvzf nagiosql_XXX.tar.gz (XXX is the release number).
Rename the newly created nagiosql32 directory to nagiosql with the command sudo mv nagiosql32 nagiosql.
You must run the following commands in order to give NagiosQL the proper permissions to
install and run. (Note: This assumes your web server runs under the www-data user name; if it
doesn't, alter the commands to suit your setup.)
NagiosQL configuration
Configuration
Directory Structure
/etc/nagiosql/
/etc/nagiosql/services
/etc/nagiosql/backup/
/etc/nagiosql/backup/hosts
/etc/nagiosql/backup/services
NagiosQL will make sure everything passes muster for the installation. If anything fails, this
screen will give you plenty of information about the problem.
Configure a database
The installer creates a database for you.
Log in. You log in to your NagiosQL site by pointing your browser to
http://ADDRESS_TO_SERVER/nagiosql/
Admin screen
Security Considerations
Best Practices
Nagios doesn't need to run as root, so don't do it. You can tell Nagios to drop privileges after startup and run as another
user/group by using the nagios_user and nagios_group directives in the main config file. If you need to execute event handlers
or plugins which require root access, you might want to try using sudo.
Make sure that only the nagios user is able to read/write in the check result path. If users other than nagios (or root) are able to
write to this directory, they could send fake host/service check results to the Nagios daemon.
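One way to spot-check the second recommendation is to inspect the permission bits of the check result directory. The sketch below only tests that group and others have no access at all; it does not verify that the owner is actually the nagios user, and the function name is an assumption. A temporary directory stands in for the real check result path.

```python
import os
import stat
import tempfile


def only_owner_can_access(path):
    """Return True when neither group nor others have any permission bits set."""
    mode = os.stat(path).st_mode
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


# Demonstration on a throwaway directory instead of the real result path.
with tempfile.TemporaryDirectory() as demo_dir:
    os.chmod(demo_dir, 0o700)
    print(only_owner_can_access(demo_dir))  # owner-only access
    os.chmod(demo_dir, 0o777)
    print(only_owner_can_access(demo_dir))  # world-writable, should be flagged
```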
Thanks
&
Questions / Answers??????