Visualising log files with gnuplot

I recently had the pleasure of supporting a new system throughout its first month of production. This was a good opportunity to refresh my command line skills. As it happened, I spent a lot of time looking at log files, trying to figure out what had happened to the production system. I figured that a graphical representation of the events would be nice and started using gnuplot.

First I started out with a bunch of bash scripts, using what your usual Unix installation provides, but then I came up with some Groovy scripts to provide better abstractions. A log file generally looks somewhat like this:

04/01/1970 07:55:13 garbage
04/01/1970 09:27:48 Event 2
04/01/1970 10:01:28 garbage
04/01/1970 10:38:30 garbage
04/01/1970 10:48:36 garbage
04/01/1970 10:51:58 Event 2
04/01/1970 11:03:45 garbage
04/01/1970 11:34:03 Event 1
04/01/1970 12:24:33 garbage
05/01/1970 04:35:50 ERROR

There is a lot of garbage plus some events we might be interested in. The scripts allow you to specify events, e.g. by providing a regexp:

Event EVENT1 = new RegExEvent("Event 1", ~/Event 1/)
Event EVENT2 = new RegExEvent("Event 2", ~/Event 2/)
Event ERROR = new RegExEvent("Error", ~/ERROR/)
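
The Event and RegExEvent classes themselves are not shown in this post. As a rough idea of what such an abstraction could look like, here is a minimal sketch. The `matches` method and the fields are assumptions made for illustration, not the actual implementation:

```groovy
import java.util.regex.Pattern

// Hypothetical sketch: the real Event/RegExEvent classes are not shown in the post.
interface Event {
    String getName()
    boolean matches(String line)
}

class RegExEvent implements Event {
    final String name        // label used for the event in a plot
    final Pattern pattern    // regular expression identifying the event

    RegExEvent(String name, Pattern pattern) {
        this.name = name
        this.pattern = pattern
    }

    // The event occurs on a line if the pattern is found anywhere in it.
    boolean matches(String line) {
        pattern.matcher(line).find()
    }
}

Event event2 = new RegExEvent("Event 2", ~/Event 2/)
assert event2.matches("04/01/1970 09:27:48 Event 2")
assert !event2.matches("04/01/1970 07:55:13 garbage")
```

Keeping the match logic behind an interface would make it possible to add other kinds of events later without touching the visualizers.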

The next step is newing up a TimeLineVisualizer on these events and passing in a stream with the actual log:

def logFile = new File("test.log");
 
logFile.withInputStream {InputStream stream ->
  def visualizer = new TimeLineVisualizer([
          EVENT1,
          EVENT2,
          ERROR
  ]);
  visualizer.visualize(stream)
}

If you have the gnuplot binary on your path, this will yield something like this:

[Image: timeline]
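
How the visualizer gets from log lines to a plot is not shown above. One way it could work, sketched here under assumptions, is to write one data line per matching log line and hand gnuplot a small script that reads the timestamps as time data. The function names, the file names `timeline.dat` and `timeline.png`, and the choice of the `png` terminal are all made up for this example and are not taken from the actual TimeLineVisualizer:

```groovy
import java.util.regex.Pattern

// Hypothetical sketch, not the original TimeLineVisualizer.
// Each matching line becomes "<date> <time> <row>", where row is the
// 1-based position of the event in the map.
List<String> toDataLines(List<String> logLines, Map<String, Pattern> events) {
    def names = events.keySet() as List
    def data = []
    logLines.each { line ->
        names.eachWithIndex { name, i ->
            if (events[name].matcher(line).find()) {
                // The timestamp is the first 19 characters: dd/MM/yyyy HH:mm:ss
                data << "${line.take(19)} ${i + 1}".toString()
            }
        }
    }
    data
}

// Builds a gnuplot script that plots one row of points per event.
String gnuplotScript(String dataFile, List<String> names) {
    def tics = []
    names.eachWithIndex { name, i -> tics << "\"${name}\" ${i + 1}" }
    """\
set terminal png size 800,300
set output "timeline.png"
set xdata time
set timefmt "%d/%m/%Y %H:%M:%S"
set format x "%d/%m %H:%M"
set yrange [0:${names.size() + 1}]
set ytics (${tics.join(', ')})
plot "${dataFile}" using 1:3 with points notitle
""".toString()
}

def events = ["Event 1": ~/Event 1/, "Event 2": ~/Event 2/, "Error": ~/ERROR/]
def sample = [
    "04/01/1970 07:55:13 garbage",
    "04/01/1970 09:27:48 Event 2",
    "05/01/1970 04:35:50 ERROR"
]
def data = toDataLines(sample, events)
def plotScript = gnuplotScript("timeline.dat", events.keySet() as List)
```

Because the time format contains a space, the timestamp occupies the first two columns of the data file, which is why the plot command uses `1:3`. Writing `data` and `plotScript` to files and running gnuplot on the script file would then produce the image, provided the installed gnuplot was built with the `png` terminal.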

In some cases you would also like to know at which time of day events are most likely to happen. For producing histograms I created another visualizer (which currently takes only one event).

logFile.withInputStream {InputStream stream ->
  def visualizer = new HistogramVisualizer(EVENT2, HistogramVisualizer.HOUR_OF_DAY_BINS)
  visualizer.visualize(stream)
}

For the example log file, which unfortunately has an even distribution of events, we get this:

[Image: histogram]
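
The binning behind such a histogram can be sketched in a few lines. This is only an illustration of counting by hour of day, assuming the timestamp layout of the example log. `countByHour` is a hypothetical helper, not part of the actual HistogramVisualizer:

```groovy
import java.util.regex.Pattern

// Hypothetical sketch, not the original HistogramVisualizer.
// Counts the lines matching the pattern per hour of day (0..23).
int[] countByHour(List<String> logLines, Pattern pattern) {
    int[] bins = new int[24]
    logLines.each { line ->
        if (pattern.matcher(line).find()) {
            // Characters 11 and 12 of "dd/MM/yyyy HH:mm:ss" hold the hour.
            int hour = Integer.parseInt(line.substring(11, 13))
            bins[hour]++
        }
    }
    bins
}

def sample = [
    "04/01/1970 09:27:48 Event 2",
    "04/01/1970 10:38:30 garbage",
    "04/01/1970 10:51:58 Event 2",
    "04/01/1970 11:34:03 Event 1"
]
def counts = countByHour(sample, ~/Event 2/)

// One "<hour> <count>" line per bin, ready to be written to a data file.
def dataLines = (0..23).collect { "${it} ${counts[it]}".toString() }
```

A data file in this form can be drawn as a bar chart with a gnuplot command along the lines of `plot "histogram.dat" using 1:2 with boxes`, where the file name is again just a placeholder.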

The cool thing about gnuplot is that you can run these things in a cron job to produce daily reports (and mail them to the appropriate people), or on a continuous integration server to visualise how the system is being exercised by the test suite.

