I recently had to configure NimBUS to send alarm upon detecting a specific log entry in /var/log/messages on a Linux system. Because this alarm was supposed to be sent by SMS , I didn’t want it to send more than one message. But since our log file has a timestamp, each entry were we found a match would be handled as a unique alarm, thus sending one message for each log entry where the mach was found.
If the string we were looking for first would appear, it would most likely show up somewhere between 5 to 50 times within an hour.It’s hard to guess, really. But we are looking for a problem that won’t solve itself, and the program checking for this problem will continue to write to the log file upon each encounter with the problem.
The way to solve this kind of problem, where we want to ignore the timestamp, is to understand how NimBUS handle incoming alarms. If it receives the same message two or more times, it would just upper the count, instead of creating a new entry in the alarm window.
Lets say our log file looks like this:
Mar 14 14:55:35 ErrorCheck: Oh noes, error detected in A51
Mar 14 14:57:32 ErrorCheck: Oh noes, error detected in A51
We only want to get one alarm, but with a count of two (actually one), not two alarms which is identical except for the timestamp. First, set up logmon to detect the correct line in the log file using regular expressions. The logmon probe supports both pattern recognition and regular expressions, so make sure to use the right one. Regex starts and end with a forward slash, otherwise it assumes pattern is used.
In this case we can use the following simple regex:
Of course my regex where more advanced since I had to detect other parameters as well, since the output of our program also had to be checked.
Now, with this regex in place, we are at the point where every entry will be treated uniquely. But logmon also give you the possibility to construct your own message, and to define variables. And that is what we have to do.
We can construct variables both by row or column number. Since this is a single line, we will use the column offset. So, let us create the variables:
prog = column number 4
error = column number 10
This is only a simplified view. The logmon probe has a user interface for this. Right click, add new variable (or something like that).
When this is done, add your own message text in the field saying so:
$prog: Error detected in $error
When this is set as the outbound message, NimBUS will count it instead of creating a new entry in the alarm view each time, since the message now is identical. If the error code changes, a new alarm will be sent.
Create your own output message when using NimBUS logmon probe on a log file which has a timestamp.
(This short version was a lot better and could have saved me some time)