A New Visualization for Web Server Logs


If the plot is too dense, as mine was, thin it out by telling Gnuplot to use only every nth data point. For example, I thinned Figure 1 by plotting every tenth point with the Gnuplot splot command:

splot "gnuplot.input" using 1:2:3 every 10

Figure 3 shows the corresponding scatter plot.

Thinned 3D scatter plot of a good day
Figure 3. Thinned scatter plot

Gnuplot makes it easy to focus on part of the plot by setting the axis ranges. Figure 4 zooms in on a small part of the Y and Z axes. The almost continuous lines that run parallel to the time axis are monitoring probes that regularly request the same page. Four of them should be clearly visible. I also changed the eye position for this figure.

Monitoring probes visible after reducing the Y and Z ranges.
Figure 4. Reduced Y and Z ranges showing monitoring probes
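
As a rough sketch of how such a zoom might be set up, the commands below restrict the Y and Z ranges before plotting. The numeric limits are placeholders, not the values used for Figure 4. In the output of the Perl script listed at the end of this article, column 2 is the IP address rank and column 3 the URL rank. The sketch assumes that whatever time-axis settings were used for the earlier figures are already in effect.

```gnuplot
# Placeholder limits: choose values that bracket the region of interest.
set yrange [0:500]    # Y axis: IP address rank (column 2)
set zrange [0:50]     # Z axis: URL rank (column 3)
splot "gnuplot.input" using 1:2:3

# Restore automatic scaling afterwards.
set autoscale y
set autoscale z
```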

Because real people need sleep, it should be possible to make out the diurnal rhythms that rule our lives. This is evident in Figure 4. The requests are denser from 08:00 to about 17:00 and quite sparse in the early hours of the morning.

Changing the viewing angle can give you a new point of view. Gnuplot lets you do it in one of two ways: with the set view command or interactively by clicking and dragging with the mouse.
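
For instance, the following commands set the viewpoint explicitly. The two arguments are rot_x and rot_z, both in degrees. The angles in the second command are arbitrary values chosen for illustration, not the ones used for any figure here.

```gnuplot
set view 60, 30     # gnuplot's default viewpoint (rot_x = 60, rot_z = 30)
replot

set view 75, 15     # illustrative angles: a flatter, more side-on view
replot

show view           # print the current view settings
```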

The Pièce de Résistance

Because the depth of a 3D plot is hard to perceive on a flat screen without stereoscopic glasses, I used a few more manipulations to "jitter" the image so that the depth in the picture becomes visible. The plot in Figure 5 is an example. It was easy to generate with a few more Gnuplot commands, followed by GIF animation with ImageMagick.

An animated scatter plot
Figure 5. An animated GIF of the scatter plot that hints at the 3D structure
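
The article does not list the commands used for Figure 5. One plausible way to get a similar effect, not necessarily the author's, is to render several frames with slightly different set view angles and join them with ImageMagick. This sketch assumes gnuplot 4.6 or later (for do for), a build with the png terminal, and ImageMagick's convert on the PATH. The frame file names, angles, and frame count are made up for illustration.

```gnuplot
# Render 10 frames whose rot_z angle swings a few degrees around 30.
set terminal png size 640,480
do for [i=0:9] {
    set output sprintf("frame%02d.png", i)
    set view 60, 30 + 3*sin(2*pi*i/10)    # small sinusoidal "jitter" in rot_z
    splot "gnuplot.input" using 1:2:3 every 10
}
unset output

# Join the frames into an endlessly looping GIF (-delay is in 1/100 s).
system "convert -delay 10 -loop 0 frame*.png jitter.gif"
```

Because the angle follows one full sine period, the last frame leads smoothly back into the first when the animation loops.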

Further Work

With Gnuplot 4.2, which is still in beta, it is now possible to draw scatter plots in glorious color. Initial tests show that using color for the status code dimension makes the plots even more informative. Stay tuned.
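
A minimal sketch of what such a colored plot could look like follows. It assumes Gnuplot 4.2 or later and that column 4 of the input holds the HTTP status code, as in the output of the Perl script listed below. The palette colors and the color range are arbitrary choices, not the author's.

```gnuplot
# Map status codes to colors: 2xx toward green, 3xx blue, 4xx orange, 5xx red.
set palette defined (200 "green", 300 "blue", 400 "orange", 500 "red")
set cbrange [200:500]

# The fourth "using" column supplies the value that selects the palette color.
splot "gnuplot.input" using 1:2:3:4 every 10 with points palette
```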


Though the 3D plots present no hard numbers or trend lines, the scatter plot as described and illustrated above may give a more intuitive view of web server requests. Especially when diagnosing problems, this alternative way of presenting logfile data can be more useful than the charts and reports of a standard log analyzer tool.

Code Listings

The Perl script:

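# convert access log files to gnuplot input
# Raju Varghese. 2007-02-03

use strict;

my $tempFilename    = "/tmp/temp.dat";
my $ipListFilename  = "/tmp/iplist.dat";
my $urlListFilename = "/tmp/urllist.dat";

# request counts per IP address and per URL; later overwritten with ranks
my (%ipList, %urlList);

# convert a dotted-quad IP address to an integer for numeric sorting
sub ip2int {
        my ($ip) = @_;
        my @ipOctet = split (/\./, $ip);
        my $n = 0;
        foreach (@ipOctet) {
                $n = $n*256 + $_;
        }
        return $n;
}

# prepare temp file to store log lines temporarily
open (TEMP, ">$tempFilename") or die "cannot write $tempFilename: $!";

# reads log lines from stdin or files specified on command line

while (<>) {
        # fields: ip, -, -, [time, zone], "method, url, protocol", status code
        my ($ip, undef, undef, $time, undef, undef, $url, undef, $sc) = split;
        $time =~ s/\[//;
        # skip images, scripts, and style sheets
        next if ($url =~ /(gif|jpg|png|js|css)$/);
        # count requests per IP address and per URL
        $ipList{$ip}++;
        $urlList{$url}++;
        print TEMP "$time $ip $url $sc\n";
}

# process IP addresses: sort numerically and replace each count by its rank

my @sortedIpList = sort {ip2int($a) <=> ip2int($b)} keys %ipList;
my $n = 0;
open (IPLIST, ">$ipListFilename") or die "cannot write $ipListFilename: $!";
foreach (@sortedIpList) {
        print IPLIST "$n $ipList{$_} $_\n";
        $ipList{$_} = $n;
        $n++;
}
close (IPLIST);

# process URLs: sort by descending request count and replace each count by its rank

my @sortedUrlList = sort {$urlList{$b} <=> $urlList{$a}} keys %urlList;
$n = 0;
open (URLLIST, ">$urlListFilename") or die "cannot write $urlListFilename: $!";
foreach (@sortedUrlList) {
        print URLLIST "$n $urlList{$_} $_\n";
        $urlList{$_} = $n;
        $n++;
}
close (URLLIST);

# reread the temp file and print time, IP rank, URL rank, and status code
close (TEMP);
open (TEMP, $tempFilename) or die "cannot read $tempFilename: $!";
while (<TEMP>) {
        my ($time, $ip, $url, $sc) = split;
        print "$time $ipList{$ip} $urlList{$url} $sc\n";
}
close (TEMP);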
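
To connect the script to the plots above, a Gnuplot session along the following lines should work. The script name log2gnuplot.pl and the log file name access_log are hypothetical. The time format string assumes common log format timestamps as they appear once the script has stripped the leading bracket, and the axis labels are my own wording.

```gnuplot
# Generate the input first, e.g. from a shell (file names are hypothetical):
#   perl log2gnuplot.pl access_log > gnuplot.input

set xdata time                       # column 1 is a timestamp
set timefmt "%d/%b/%Y:%H:%M:%S"      # e.g. 03/Feb/2007:10:15:32
set format x "%H:%M"                 # label the time axis with hours and minutes
set xlabel "time"
set ylabel "IP address rank"
set zlabel "URL rank"
splot "gnuplot.input" using 1:2:3
```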

Raju Varghese has a bachelor's degree in Electrical Engineering from BITS, Pilani (India) and a master's degree in Computer Science from the University of Texas, San Antonio.
