
4: Nagios

Nagios (Figure D) is considered by many to be the king of open source network monitoring
systems. Although not the easiest tool to set up and configure (you have to manually edit
configuration files), Nagios is incredibly powerful. And even though the idea of manual
configuration might turn some off, that setup actually makes Nagios one of the most flexible
network monitors around. In the end, the vast number of features Nagios offers is simply
unmatched. You can even set up email, SMS, and printed paper alerts!

Figure D

For monitoring statistics (memory usage, load, MySQL activity, Apache activity, etc.) I use Munin. Out
of the box it already tracks a lot of things and plots graphs for different time intervals (last 24 hours,
last 7 days, last month, last year). Through plugins even more things can be monitored. Its output
is a set of HTML pages with pretty graphs.
Munin has a master/node architecture: nodes gather statistics on a server and the master stores the
data and produces HTML and graphs.
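As an illustration of how a plugin feeds a node, here is a minimal sketch of a Munin-style plugin in Python. It assumes the usual plugin convention: called with the argument "config" it prints the graph description, otherwise it prints one "<field>.value <number>" line per field. The field name "load" and the graph title are made up for this example, and os.getloadavg() is only available on Unix-like systems.

```python
#!/usr/bin/env python3
"""Sketch of a Munin-style plugin reporting the 1-minute load average."""
import os
import sys


def config_lines():
    # Printed when the plugin is called with the "config" argument.
    return [
        "graph_title Load average (sketch)",
        "graph_vlabel load",
        "graph_category system",
        "load.label load",
    ]


def value_lines(load=None):
    # Printed on a normal run: one "<field>.value <number>" line per field.
    if load is None:
        load = os.getloadavg()[0]  # 1-minute load average (Unix only)
    return [f"load.value {load:.2f}"]


def main(argv):
    wants_config = len(argv) > 1 and argv[1] == "config"
    lines = config_lines() if wants_config else value_lines()
    print("\n".join(lines))


if __name__ == "__main__":
    main(sys.argv)
```

The master only ever sees these text lines, which is why plugins can be written in any language.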
I use Monit to keep track of running processes and to restart them or alert me when certain configurable
conditions arise (high CPU load, high memory usage, no HTTP response, etc.). Monit can also
monitor more general things about a server, such as CPU load, memory usage, hard disk status or
disk usage.
For every service or piece of hardware you want to monitor, you need to configure Monit and tell it how to respond
when something goes wrong. The most used options are to do nothing, send an alert email or restart
the service.
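The "condition, then response" idea can be sketched in a few lines of Python. This is not Monit's configuration syntax or implementation, only an illustration of the three responses named above; the Rule class, the metric names and the thresholds are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    NOTHING = "nothing"
    ALERT = "alert"
    RESTART = "restart"


@dataclass
class Rule:
    metric: str       # hypothetical metric name, e.g. "cpu_percent"
    threshold: float  # the rule fires when the reading exceeds this value
    action: Action


def decide(rules, readings):
    """Return (metric, action) pairs for every rule whose threshold is exceeded."""
    triggered = []
    for rule in rules:
        value = readings.get(rule.metric)
        if value is not None and value > rule.threshold:
            triggered.append((rule.metric, rule.action))
    return triggered


if __name__ == "__main__":
    rules = [
        Rule("cpu_percent", 90.0, Action.ALERT),
        Rule("memory_percent", 95.0, Action.RESTART),
    ]
    print(decide(rules, {"cpu_percent": 97.0, "memory_percent": 40.0}))
```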

Monit is great when it works, but sometimes it fails to start, stop or restart a service and there is not
a lot of diagnostic information available to tell you what went wrong. This means you don't know if
the problem was with your service or with the Monit configuration (Monit runs its commands in a
minimal, cron-like environment).
Both tools are available by default on most Linux distributions.
It all depends on what you mean by "monitor"!

Is it (system or service) available? We use Nagios.


What is it doing? We use Munin for Linux servers, and Cacti for just about everything else,
even though it is a pain to configure sometimes...

What has it done? We use syslog-ng to concentrate syslogs in one place and then run a
customized logcheck script daily to send reports via email. We are looking for something similar
for Windows servers.

Nagios is great since it's free and there are plenty of plugins for it. However, the UI and config
are very difficult.


Its exact opposite in pros and cons, which is also great, is Microsoft System Center Operations
Manager (SCOM): it is not free and has fewer plugins, but setup and config are brilliant and
easy.

I must admit that if I were in a primarily Microsoft company, had very high reliability requirements
(i.e. can't afford for monitoring to break) or had to think about getting developers to work with
it, then SCOM would be my recommendation over Nagios.

I've used Nagios in the past with success. It's very extensible (over 200 add-ons), relatively easy to
use and offers lots of reports. A negative would be the initial setup.

answered Apr 30 '09 at 8:25

community wiki
jdiaz

Nagios works great to monitor all types of hosts (Windows, Linux, routers, switches, etc.). I recommend using a
configuration tool like fruity or Lilac to ease the configuration pain. Use NSClient++ on the Windows boxes and nagios-statd on the Linux stuff to monitor running processes, disk usage, etc.
TonyB May 1 '09 at 23:27

Features
Semonto is a Server Monitoring Tool. That means we will monitor your server
24/7 and alert you instantly when a server problem occurs!

Warnings by E-mail, SMS, Apple Push Notification or Twitter


Semonto will email you the exact warning or error message when a check fails. If you have
an iPhone, you can download our free iPhone client and receive Apple Push Notifications. For
those who don't, we offer the possibility to be alerted via SMS or even Twitter.

Multiple Test Servers Around the World


Semonto has multiple test servers around the globe. That way you can choose which server
needs to check your server. It is also possible to select a different server for every host, so
you can test the connectivity to your host from several countries.

More Than 15 Default Tests Available


Semonto offers more than 15 default tests, which include a ping test, packet loss, server load, MySQL tests, port tests and many more. The default tests will cover the basic needs for
testing hosts.

Create Your Own Tests


Furthermore we offer the possibility of customizing your own tests. You will be able to create
tests for almost anything, e.g. for monitoring a buffer queue, the network load or the
temperature of your server. Implementations for certain tests like the RAM-usage and more
can be found on our blog and on the help pages.

Status Of Your Servers


We provide a web-page where the status of your server is shown. You can share this page
and the status of your servers with others, without them being able to access Semonto. It is
also possible to make a custom URL (e.g. http://myserver.semonto.com), to protect the
status with a password, to show multiple servers on one single URL etc.


History of performed tests


Semonto keeps a log of all the performed tests. The results of those tests can be viewed at
the history center. You can check the values every year, month, day or hour and even view
the data in a graph to easily spot patterns or long-term evolutions. Every server has a
logbook. This logbook contains every alert-message ever sent, along with detailed
information.


PulseCheck
PulseCheck is another feature of Semonto and can be seen as your server's pacemaker. In
some cases, we are unable to connect directly to your server and have to work the
other way around: your server sends out pulses to Semonto by calling our
web service. If we don't receive a pulse in time, you will be alerted, just as a pacemaker
would react if your heart stopped beating.
PulseCheck is useful for monitoring periodical background jobs, but also for servers running
behind firewalls or servers that don't run a web service.
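The pulse idea (often called a heartbeat or dead man's switch) can be sketched as follows. This is a generic illustration, not Semonto's web service or API: the PulseMonitor class, the job names and the interval and grace values are hypothetical. The monitor records when each job last checked in and reports the jobs whose last pulse is older than the allowed interval.

```python
import time


class PulseMonitor:
    """Tracks the last pulse per job and reports jobs that are overdue."""

    def __init__(self, interval_seconds, grace_seconds=0.0):
        self.interval = interval_seconds
        self.grace = grace_seconds
        self.last_pulse = {}

    def pulse(self, job, now=None):
        # Called whenever a job checks in; "now" can be injected for testing.
        self.last_pulse[job] = time.time() if now is None else now

    def overdue(self, now=None):
        # A job is overdue when its last pulse is older than interval + grace.
        now = time.time() if now is None else now
        deadline = self.interval + self.grace
        return sorted(
            job for job, seen in self.last_pulse.items() if now - seen > deadline
        )


if __name__ == "__main__":
    monitor = PulseMonitor(interval_seconds=60, grace_seconds=10)
    monitor.pulse("nightly-backup", now=0)
    monitor.pulse("queue-worker", now=100)
    print(monitor.overdue(now=120))
```

In this sketch an alert would be sent for every job returned by overdue(); the grace period avoids false alarms when a pulse arrives slightly late.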

Monthly uptime report


Every month Semonto will send you a report of your server's uptime statistics, including the
uptime percentages for every service. You will be able to view encountered problems in a
daily calendar and in an overview table.
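For readers who want to reproduce such a figure themselves, an uptime percentage is simply the share of the period during which the service was up. The sketch below assumes downtime is measured in minutes; the function name and the sample numbers are illustrative and are not taken from Semonto's reports.

```python
def uptime_percent(period_minutes, downtime_minutes):
    """Return uptime as a percentage of the reporting period."""
    if period_minutes <= 0:
        raise ValueError("period_minutes must be positive")
    if not 0 <= downtime_minutes <= period_minutes:
        raise ValueError("downtime_minutes must lie within the period")
    return 100.0 * (period_minutes - downtime_minutes) / period_minutes


if __name__ == "__main__":
    minutes_in_30_days = 30 * 24 * 60  # 43200 minutes
    print(f"{uptime_percent(minutes_in_30_days, 43.2):.2f}%")
```

In a 30-day month, about 43 minutes of downtime corresponds to roughly 99.9% uptime.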


Software as a Service
Semonto is what is called SaaS, Software as a Service. You won't have to install any
software, nor does anything have to run in the background. We look after the software and
make sure to keep it up to date. We also introduce new features, improve performance,
make backups and more. Although the software isn't installed on your server/network, we
are still able to send out alerts, even when your network is offline!

Support for timezones


Semonto supports timezones. All reports will be in your local timezone.

User Defined Monitoring and Instant Network Alerts

Monitor the Windows Event log

Alert on hardware and software changes

Alert on specific file changes and protection violations

Know if disk space is running low on computers

Monitor computer online/offline status

Know if a server goes down

Know when traveling users with notebooks connect

Alert message and recipient configuration

With Nagios you can:


Monitor your entire IT infrastructure
Spot problems before they occur

Know immediately when problems arise

Share availability data with stakeholders

Detect security breaches

Plan and budget for IT upgrades

Reduce downtime and business losses

Nagios Features


Nagios is a powerful monitoring system that enables organizations to identify and resolve IT infrastructure
problems before they affect critical business processes.


Features
Comprehensive Monitoring
Capabilities to monitor applications, services, operating systems, network protocols, system metrics
and infrastructure components with a single tool
Powerful script APIs allow easy monitoring of in-house and custom applications, services, and systems
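A custom check is typically a small script that prints one status line and exits with a status code. The sketch below follows the widely used plugin convention (exit code 0 for OK, 1 for WARNING, 2 for CRITICAL, 3 for UNKNOWN, with optional performance data after a "|"). The thresholds, the "DISK" label and the function names are illustrative assumptions, not taken from the feature list above.

```python
import shutil
import sys

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(used_percent, warn=80.0, crit=90.0):
    """Map a disk usage percentage to a plugin status code."""
    if used_percent >= crit:
        return CRITICAL
    if used_percent >= warn:
        return WARNING
    return OK


def check_disk(path="/", warn=80.0, crit=90.0):
    """Return (status_code, status_line) for the filesystem containing path."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        return UNKNOWN, f"DISK UNKNOWN - {exc}"
    used = 100.0 * usage.used / usage.total
    status = evaluate(used, warn, crit)
    line = (
        f"DISK {LABELS[status]} - {used:.1f}% used on {path}"
        f" | used={used:.1f}%;{warn};{crit}"
    )
    return status, line


if __name__ == "__main__":
    code, message = check_disk()
    print(message)
    sys.exit(code)
```

The monitoring server only needs the exit code and the first line of output, so the same pattern works for in-house applications: replace the disk measurement with whatever value matters.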

Visibility
Centralized view of entire monitored IT infrastructure
Detailed status information available through web interface

Awareness
Fast detection of infrastructure outages
Alerts can be delivered to technical staff via email or SMS
Escalation capabilities ensure alert notifications reach the right people

Problem Remediation
Alert acknowledgments provide communication on known issues and problem response
Event handlers allow automatic restart of failed applications and services

Proactive Planning
Trending and capacity planning addons ensure you're aware of aging infrastructure
Scheduled downtime allows for alert suppression during infrastructure upgrades

Reporting
Availability reports ensure SLAs are being met
Historical reports provide record of alerts, notifications, outages, and alert response
Third-party addons extend reporting capabilities

Multi-Tenant Capabilities
Multi-user access to web interface allows stakeholders to view infrastructure status
User-specific views ensure clients see only their infrastructure components

Extendable Architecture
Integration with in-house and third-party applications is easy with multiple APIs
Hundreds of community-developed addons extend core Nagios functionality

Stable, Reliable, and Respected Platform


Over 10 years of active development
Scales to monitor thousands of nodes
Failover capabilities ensure non-stop monitoring of critical IT infrastructure components
Multiple awards, media coverage and recognition prove Nagios' value

Vibrant Community
An estimated 1 million+ users worldwide

Active community mailing lists provide free support


Hundreds of community-developed addons extend Nagios' core functionality

Customizable Code
Open Source Software
Full access to source code

Released under the GPL license

NETCOOL

Near real-time event management to improve availability and resiliency

IBM Tivoli Netcool/OMNIbus delivers near real-time, consolidated
event management across business infrastructure, data centers, complex
networks and IT domains. The software provides full management and
automation to help you deliver continuous uptime of business services and
applications.
Tivoli Netcool/OMNIbus provides near real-time service assurance for
business infrastructure, applications, servers, network devices and
protocols, Internet protocols, storage and security devices.
Helps increase efficiency and speed problem resolution by
consolidating network and IT operations in a single management solution.
Combines scalability with a flexible architecture to help scale from
small to large environments, with more than 100 million events per day
across multiple networks, IT silos and geographies.
Helps enable quick resolution by allowing operators to run automated
resolution scripts against recurring, predictable problems.
Helps address the most critical problems, and automates isolation
and resolution by using customizable lightweight agents to collect business
and technology events in near real time.

Integrates with application performance management solutions so
you can proactively measure user experiences and performance across
applications.

Service quality and performance management for IP and wireline networks

IBM Netcool/Proviso provides service quality and performance
management for Internet Protocol (IP) and wireline networks. It reports on
network performance and usage to help avoid, detect and analyze service
issues throughout an organization. It offers carrier-class scalability and
flexibility to deploy next-generation network services and technologies
more quickly.
Netcool/Proviso features:
Network performance management tools that monitor usage and
service quality and help reduce operating and capital costs.
Service quality metrics that uncover usage issues to help improve
customer satisfaction.
Flexible operations and customer reports that deliver network
performance data.
A single, automated platform that streamlines performance
management of disparate IP and wireline networks.
Integrated service management that supports leading network
technologies and extends performance management capabilities.

TEC

Monitoring events on Tivoli Enterprise Console


You can use the Tivoli Enterprise Console to monitor events.
From the IBM Tivoli Enterprise Console, you can:

Find similar events. For more information about tasks, refer to the Tasks section of the IBM Tivoli
Enterprise Console Command and Task
Reference at http://publib.boulder.ibm.com/tividd/td/tec/SC32-123200/en_US/HTML/ecormsttfrm.htm.
Sort events by hostname.
To sort events by hostname, do the following:

1. From the Tivoli Enterprise Console - Summary Chart View window, under the Windows menu,
select Configuration.
Figure 1. Tivoli Enterprise Console Summary Chart View window

2. From the Tivoli Enterprise Console - Configuration window, right-click Event Groups and
select Create Event Group.
Figure 2. Tivoli Enterprise Console Create Event Group option

3. Enter a name (for example, elake) and description for your new Event Group on the Event Group
Properties window.

Figure 3. Tivoli Enterprise Console Event Group Properties window

4. From the Tivoli Enterprise Console - Configuration window, right-click Event Groups,
select elake and then select Create Filter.
Figure 4. Tivoli Enterprise Console Create Filter option

5. From Add Event Group Filter, enter a name and click Edit Constraint.
6. From Edit Constraint, under Attribute select Hostname, under Operator select Equal to (=), and
enter the hostname (for example: elake) for Value.

Figure 5. Tivoli Enterprise Console Edit Constraint window

7. Click OK to add the constraint.
8. Click OK again to close the Add Event Group Filter window.
9. Right-click Console>Default>Event Groups and select Assign Event Groups.
10. From Console Properties, click Assign Groups.
11. Select elake to add it to the console and click OK.
12. Click OK to close the Console Properties window.
13. From the Tivoli Enterprise Console - Summary Chart View window, click elake bar to view the
events.

Figure 6. Event Viewer: Group elake window
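Conceptually, the event group built in the steps above is a filter with one constraint: attribute Hostname, operator Equal to (=), value elake. The Python sketch below shows that logic on plain dictionaries. It does not use any Tivoli Enterprise Console API; the Constraint class and the sample events are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Constraint:
    attribute: str  # e.g. "hostname"
    operator: str   # only "=" is supported in this sketch
    value: str

    def matches(self, event):
        actual = event.get(self.attribute)
        if self.operator == "=":
            return actual == self.value
        raise ValueError(f"unsupported operator: {self.operator}")


def filter_events(events, constraint):
    """Return the events that satisfy the constraint, in their original order."""
    return [event for event in events if constraint.matches(event)]


if __name__ == "__main__":
    events = [
        {"hostname": "elake", "severity": "CRITICAL", "msg": "disk full"},
        {"hostname": "other", "severity": "WARNING", "msg": "high load"},
        {"hostname": "elake", "severity": "HARMLESS", "msg": "login"},
    ]
    group = filter_events(events, Constraint("hostname", "=", "elake"))
    for event in group:
        print(event["severity"], event["msg"])
```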
