Cron: pinging hosts

by Juliet Kemp

I realised a while ago that it would be useful to check, occasionally, that all the machines I'm responsible for are still up. (This helps to minimise those embarrassing "Oh, I didn't know there was anything wrong with it" conversations.)

Thus, the following pretty basic Perl script, which I run from /etc/crontab on my own desktop every couple of hours:

#!/usr/bin/perl -w
#
# host_ping.pl - run from crontab

use strict;
use Net::Ping;
use Net::SMTP;

sub sendmail;

my $ping  = Net::Ping->new();
my $email = 'me@example.com';

my @host_array = qw/host1 host2 serverA serverB/;
my $hosts_down = "";

foreach my $host (@host_array) {
    unless ($ping->ping($host)) {
        $hosts_down .= "$host ";
    }
}

sendmail() if ($hosts_down ne "");

sub sendmail
{
    # email to me
    my $s = Net::SMTP->new('mailserver.example.com')
        or die "Cannot connect to mail server\n";
    $s->mail($email);
    $s->to($email);
    $s->data("Subject: Host(s) down: $hosts_down","\n","\n");
    $s->quit;
}
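
As a rough illustration, an /etc/crontab entry for running this every two hours might look like the following. The install path and the user name are assumptions, not details of the setup described above. Note that /etc/crontab, unlike a per-user crontab, takes a user field before the command.

```
# m  h    dom mon dow  user    command
0    */2  *   *   *    juliet  /usr/local/bin/host_ping.pl
```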

Also this week, I've been organising an engineer for a 4TB RAID 5 array which had two disks fall over at the same time. Apparently this is increasingly common with large SATA disks (we had ten 500GB disks) - probably due to the heavy load put on the disks by rebuilding. And of course it renders the RAID 5 unusable, so reinstall/restore-from-tape fun is on the horizon once the engineer currently in the server room has established that it's definitely kaput.

The other current project is looking at Puppet. So far I've got a server and test client working, and am cautiously optimistic about prospective usefulness. I wish you could readily up the log level without having to run in the foreground, mind. I will doubtless blog more on this in future.


10 Comments

banetbi
2007-09-06 11:06:31
Try out the hobbit monitor http://hobbitmon.sourceforge.net/ for better server monitoring that is open source.
nicerobot
2007-09-06 12:12:35
I've found ping to be too limited. I like to know which services are running. Particularly, port 80.


I do this (a lot more than this but this is the gist of it):


#!/bin/sh


for host in host1 host2 serverA serverB; do
if ! curl --max-time 30 --connect-timeout 20 -sL http://${host} >${LOG} 2>&1; then
mail -s "${host}:80 isn't responding" my@email.com -r me@somehost.com
fi
done
exit 0

For other services, maybe

#!/bin/sh


for host in host1 host2 serverA serverB; do
for port in 21 22 25 110 143; do
if ! nmap -P0 -pT:${port} ${host} | grep "${port}/tcp" | grep open >/dev/null 2>&1; then
mail -s "${host}:${port} isn't open" my@email.com -r me@somehost.com
fi
done
done
exit 0

These can and should be optimized but that's along the lines of how I like to "ping" sites.
Caitlyn Martin
2007-09-06 14:37:12
There's nothing wrong with your script, of course, but I agree with the other commenters that ping is generally not enough. I need to know that the server is alive and well, but I also need to know that it's doing what it's supposed to. My preferred Open Source monitoring tool is Nagios. I've always run it on a Linux server but I've used it to monitor Solaris, HP-UX, AIX, and even Windows.
Jason L
2007-09-06 19:24:08
You can also use Mon http://mon.wiki.kernel.org/index.php/Main_Page
so you can actually monitor the running service. This is useful if you only have a couple of servers to monitor. With too many, I would recommend something like the above (Nagios).


You can even use mon and heartbeat to fail over the server (service) if needed.

Stéphane Bortzmeyer
2007-09-07 01:30:14
Extremely bad idea to use cron. cron has no memory so it cannot remember if it already mailed you or not. If you run cron every ten minutes and you leave for the night, you'll get dozens of emails!


Better to use an always-running program like mon, mentioned by Jason. mon allows you to specify after how many failures it emails the sysadmin, how many alerts it sends, etc.

Juliet Kemp
2007-09-07 05:30:57
Thanks for the comments! Most of my machines are desktops rather than servers; but I probably should be checking ssh (the only service run on desktops) as well as just pinging.


I do have other things in place for servers, but thanks for the hobbitmon & mon suggestions - I hadn't seen those & will check them out.


Stephane - I take your point, but in practice this isn't an issue for me. I am going to have a look at mon or whatever, though - may be a better way of doing this!
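
If repeat mails ever did become a nuisance, one way to give a cron job a little memory is a state file that records what was reported last time. A minimal sketch in shell, assuming a hypothetical helper named notify_if_changed and a state file path of your own choosing; neither comes from the script above:

```shell
#!/bin/sh
# notify_if_changed STATE_FILE DOWN_LIST
# Hypothetical helper: prints DOWN_LIST only when it differs from the list
# recorded on the previous run, then records the new list for next time.
notify_if_changed() {
    state_file=$1
    current=$2
    previous=""
    # Read what was reported last time, if anything was recorded.
    if [ -f "$state_file" ]; then
        previous=$(cat "$state_file")
    fi
    # Remember the current list for the next run.
    printf '%s\n' "$current" > "$state_file"
    # Only speak up when something has changed since the last run.
    if [ "$current" != "$previous" ]; then
        printf '%s\n' "$current"
    fi
}
```

A wrapper run from cron could capture the output, for example with out=$(notify_if_changed /var/tmp/host_ping.state "$down"), and send mail only when $out is non-empty. A recovery (the list going back to empty) produces empty output here, so a fuller version would want to report that case explicitly.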

Jeroen
2007-09-08 06:57:24
Wouldn't something like cacti, zenoss or nagios be a solution for you? You can let it auto-report to a mail address when a host is down or even when a printer is out of toner...
Manny
2007-09-11 14:37:38
Nice! Although we use Nagios here, this still comes in handy if you have some smaller systems to monitor. I like the Perl script, though! I'm sure it gets the job done just fine!
Saint Aardvark
2007-09-14 11:10:18
One more vote for Nagios. Each time I set it up, it takes me about half an hour to wrap my head around hosts/hostgroups/services, and then I can start adding stuff. Once you get one or two hosts/services in there, it's trivial to add more -- cut-and-paste for the most part. And pointing it at desktop machines is always nice...as you say, it prevents the "oh, everything's down?" moment. :-)
Juliet Kemp
2007-09-20 06:25:01
Thanks for the nagios suggestions - will take a look at that as well.