We often need to monitor one or more servers for a wide assortment of information. This article describes a simple single-server solution that gets monitoring up and running as quickly as possible without using all the resources on your server. Typically this does not include monitoring anything outside that server. We have chosen the tools below for their ease of use.
Most of this guide is based on Debian/Ubuntu systems, but it should be easily adaptable to CentOS as well.
A simple requirements list...
1. Email for alerts
2. Logging on the server
3. Schedule-able monitoring service (easy to use and configure, low overhead)
First up is the email service. We recommend Postfix for getting mail running. It should be installed by default on new installs. If it is not, or if you still need to get mail working on your server, check our Mail setup guide for comprehensive instructions.
To provide helpful ongoing logs of system information, we can use the Sysstat package. When you receive an alert from Monit, you can use those logs to help diagnose the cause. To install the package, run:
apt-get install sysstat
That will provide a reasonable default setup for you. You may need to enable it by editing /etc/default/sysstat and changing the line that says 'ENABLED="false"' to 'ENABLED="true"'.
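The enable step can also be scripted. The following is a minimal sketch, assuming GNU sed and the Debian path given above; enable_sysstat is a hypothetical helper name, not part of the Sysstat package.

```shell
# Sketch only: flip ENABLED="false" to ENABLED="true" in a sysstat defaults file.
# The default path is the one named in this guide; pass another path to override it.
enable_sysstat() {
    conf="${1:-/etc/default/sysstat}"
    # Edit in place; lines that do not match are left untouched.
    sed -i 's/^ENABLED="false"/ENABLED="true"/' "$conf"
}
```

Run as root, "enable_sysstat" followed by "/etc/init.d/sysstat restart" should be enough. Once some data has been collected, "sar -u" shows CPU history and "sar -r" shows memory history for the current day.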
For the monitoring service itself we use Monit. Compared to other service monitors we have looked at, Monit has great documentation, and its configuration files are designed to be very readable, making them easier to understand.
If it's not already installed, run:
apt-get install monit
If you are running CentOS, you can install a recent version of Monit from EPEL (yum install monit). Both installations put the main configuration file at /etc/monit.conf.
On Debian/Ubuntu systems, edit /etc/default/monit and change the line "startup=0" to "startup=1" so that Monit will actually run. Then restart the Monit service (e.g. with "/etc/init.d/monit restart").
Before changing the configuration, back up the default configuration file in case you want to refer to it later. Assuming you don't already have Monit set up to do anything, move it aside as follows...
mv /etc/monit.conf /etc/monit.conf.example
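Create the Monit config file (/etc/monit.conf) with the contents below. You may wish to adjust the resource limits described. Note that "set daemon 120" makes Monit poll every 120 seconds, so a test "for 5 cycles" must keep failing for about ten minutes before it alerts.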
# Monit control file
# ==================
# You can find the latest version of the monit manual at
# http://www.tildeslash.com/monit/doc/manual.php
# Monit global settings:
# ----------------------
set daemon 120
set logfile syslog
set mailserver 127.0.0.1
# you can add more alert lines here to include more email addresses
set alert root@localhost
# uncomment below lines to enable a web interface on "http://<your main ip>:2812"
#set httpd port 2812 and
# allow admin:changeme # require user 'admin' with password 'changeme'
# Simple resource monitoring
# --------
# You may globally manage these from the command line without affecting
# other monitoring with the following command...
# "monit -g resources [stop|start|restart]"
# check against RAM and RAM + SWAP.
check system Memory
    alert root@localhost on { resource } with reminder on 10 cycles
    group resources
    if memory usage > 80% for 5 cycles then alert
    else if passed within 5 cycles then alert
    if loadavg (15min) is greater than 0.95 for 2 cycles then alert
    else if passed within 5 cycles then alert
# check against disk usage. may need to be /dev/xvda1 on some file systems.
check device Root-filesystem with path /dev/root
    alert root@localhost on { resource } with reminder on 10 cycles
    group resources
    if space usage is greater than 95% for 5 cycles then alert
    else if passed within 5 cycles then alert
#
# Add configuration parts from other files or directories. The following is
# the recommended location for such files, but you can include those from anywhere.
#
#include /etc/monit.d/*
Restart monit, and set it to start on boot ...
/etc/init.d/monit restart
update-rc.d monit defaults
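It can be worth checking the control file before a restart. Monit refuses to run if the control file is accessible by group or others, and "monit -t" runs a syntax check on it. The sketch below assumes GNU stat and the /etc/monit.conf path used in this guide; perms_ok and check_control_file are hypothetical helper names, not Monit commands.

```shell
# Sketch only: sanity checks for a Monit control file.

# Succeeds when the file grants no permissions to group or others (e.g. 600 or 700).
perms_ok() {
    mode=$(stat -c '%a' "$1") || return 1
    case "$mode" in
        [0-7]00) return 0 ;;
        *) echo "$1 has mode $mode; try: chmod 600 $1" >&2; return 1 ;;
    esac
}

# Checks permissions first, then asks Monit itself to parse the file.
check_control_file() {
    conf="${1:-/etc/monit.conf}"
    perms_ok "$conf" || return 1
    monit -t -c "$conf"
}
```

Running "check_control_file && /etc/init.d/monit restart" as root then only restarts Monit when both checks pass.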
In /usr/share/doc/monit/README.Debian, you can find some information about creating a monit_delay script to prevent Monit from restarting services that are slow to come up on reboot.
It's actually reasonable to have limits (e.g. for memory tests) set quite low, since the server will need some breathing room to properly complete any follow-up tasks.
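When choosing a memory limit, it helps to know what the server normally uses. The following sketch reads a meminfo-style file (by default /proc/meminfo on Linux) and prints an approximate used-memory percentage. It is only a rough guide, and Monit's own calculation may differ. mem_used_percent is a hypothetical helper name.

```shell
# Sketch only: approximate memory usage as a whole-number percentage.
# Uses MemAvailable when the kernel reports it, otherwise MemFree + Buffers + Cached.
mem_used_percent() {
    awk '
        /^MemTotal:/     { total = $2 }
        /^MemAvailable:/ { avail = $2; have_avail = 1 }
        /^MemFree:/      { free = $2 }
        /^Buffers:/      { buffers = $2 }
        /^Cached:/       { cached = $2 }
        END {
            if (!have_avail) avail = free + buffers + cached
            if (total > 0) printf "%d\n", (total - avail) * 100 / total
        }
    ' "${1:-/proc/meminfo}"
}
```

Calling "mem_used_percent" a few times during normal operation gives a baseline to set the limit against. The 15 minute load average that the example configuration tests is the third field of /proc/loadavg.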
To enable the Monit web interface, uncomment the two relevant lines near the top of the example above, and then restart Monit. We recommend you change the password.
If you want Monit to keep an eye on services for you, enable a subdirectory from which configuration files can be included. In the example, the "include" directive just needs to be uncommented. Then create one file for each service. You can use multiple include directives, although Monit will complain if two of them define the same service. This makes administration easier than looking through one long file.
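Since Monit complains when the same service is defined twice, a quick scan of the include directory can catch duplicates before a restart. This is a minimal sketch, assuming one "check <type> <name>" statement per line and the /etc/monit.d directory from the example; find_duplicate_checks is a hypothetical helper name.

```shell
# Sketch only: print service names that appear in more than one "check" statement.
find_duplicate_checks() {
    dir="${1:-/etc/monit.d}"
    # The service name is the third word of each "check <type> <name>" line.
    cat "$dir"/* 2>/dev/null |
        awk 'tolower($1) == "check" { print $3 }' |
        sort | uniq -d
}
```

No output means no duplicate names were found. Commented-out lines are ignored because their first word is not "check".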
The following are some simple guides or 'recipes' to check on various services. If you have constructive improvements or alternatives we'd love to hear from you.
We are assuming you will create each service monitor in a separate file and use the Include directive mentioned above.
You can find many other configuration examples for popular applications by running a web search.
To test Apache's status, it's standard practice to create an empty file or 'token' for Monit to request.
On Debian:
mkdir /var/www/monit/
touch /var/www/monit/token
Paste this in your Apache configuration:
SetEnvIf Request_URI "^/monit/token$" dontlog
CustomLog logs/access.log common env=!dontlog
Then paste this in /etc/monit.d/apache.conf
# adjust the pidfile path to match the PidFile setting of your Apache
check process apache
    with pidfile "/var/run/httpd.pid"
    start program = "/etc/init.d/apache2 start"
    stop program = "/etc/init.d/apache2 stop"
    if failed host 127.0.0.1 port 80
        protocol HTTP request /monit/token then restart
    if 5 restarts within 5 cycles then timeout
Restart Monit with:
/etc/init.d/monit restart
and check that it's running:
ps aux | grep monit
root 11583 0.0 0.7 20400 1248 ? Sl 04:20 0:00 /usr/sbin/monit -d 180 -c /etc/monit/monitrc -s /var/lib/monit/monit.state
Then, to test that it works, let's stop Apache:
/etc/init.d/apache2 stop
In syslog you'll see something like this:
Jun 18 04:32:41 rimu monit[11583]: HTTP error: Server returned status 404
Jun 18 04:32:41 rimu monit[11583]: 'apache2' failed protocol test [HTTP] at INET[127.0.0.1:80] via TCP
Jun 18 04:32:41 rimu monit[11583]: 'apache2' trying to restart
Jun 18 04:32:41 rimu monit[11583]: 'apache2' stop: /etc/init.d/apache2
Jun 18 04:32:42 rimu monit[11583]: 'apache2' start: /etc/init.d/apache2
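The token request can also be made by hand, which helps when working out why the protocol test fails. The sketch below assumes curl is installed. As far as we understand, Monit's HTTP test fails by default on status codes of 400 and above, and the helper mirrors that rule. http_status and status_is_healthy are hypothetical helper names.

```shell
# Sketch only: fetch the HTTP status code for a URL, discarding the body.
# curl prints 000 when it cannot connect at all.
http_status() {
    curl -s -o /dev/null -w '%{http_code}' "$1"
}

# Succeeds for codes from 200 up to 399, fails for everything else.
status_is_healthy() {
    [ "$1" -ge 200 ] && [ "$1" -lt 400 ]
}
```

For example, 'status_is_healthy "$(http_status http://127.0.0.1/monit/token)" && echo ok' prints "ok" only while Apache is serving the token.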
A similar check for Nginx:
check process nginx with pidfile /var/run/nginx.pid
    start program = "/etc/init.d/nginx start"
    stop program = "/etc/init.d/nginx stop"
    if failed port 80 protocol HTTP request / then restart
    if 5 restarts within 5 cycles then timeout
For Mongrel, see:
http://www.igvita.com/blog/2006/11/07/monit-makes-mongrel-play-nice/
http://software.pmade.com/blogs/ramblings/2006/12/27/mongrel-cluster-and-monit
http://rubyforge.org/pipermail/mongrel-users/2006-September/001359.html
To monitor a remote host:
check host arbitrary_name with address real_url
    if failed port 80 proto http then alert
See the monit docs or man page for more examples.
You will likely want to customise the above configurations to better suit your needs. The nice thing about Monit and Sysstat is that they are relatively simple to use and pretty hard to mess up to the point where you could cause issues for your server. Mostly, if you make a mistake with the configuration, they just won't run. Remember to make backups of your working config files so you don't lose that data :).
Postfix is a little different though. If your server is on the public internet, we recommend you read the documentation (http://www.postfix.org/documentation.html) to ensure you don't invite spam to your server. Simple precautions should help you proceed hassle free.
If you are interested in a simple scripted technique with bash, check out http://www.voluntary-simplicity.org/linux/node/22