Logfile Check on Linux

Operations Manager 2007 R2 can monitor Linux and UNIX machines. Among other new features, there are two new management pack templates:

  • Unix/Linux LogFile (monitor a logfile for a specified entry)
  • Unix/Linux Service (monitor a service with a standalone process)

In this post I will show some ideas for monitoring file size on a Linux machine. File size monitoring is not a default feature in R2, on either Windows or Linux machines. On Windows machines I use a two-state monitor and a script, described in this post.

The first step is to create a script on the Linux side. This script checks the size of the file, and if the file is larger than 100 bytes it writes a warning to a logfile (scriptlog.log).

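#!/bin/sh
# Print the size in bytes and the path of /load.sh, then compare the size with the limit
find /load.sh -printf '%s %p\n' | while read size name; do
  if [ "$size" -gt 100 ]; then
    # Append a timestamped warning that the logfile template can match on
    echo "$(date) WARNING the file is $size" >> scriptlog.log
  fi
done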
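
The same check can also be wrapped in a small function so that the file, the limit and the log path are explicit. This is only a minimal sketch and not the script I run above: the name check_size and its three arguments are assumptions made for illustration, and it reads the size with wc -c instead of find -printf.

```shell
#!/bin/sh
# check_size FILE LIMIT LOGFILE (hypothetical helper, not from the original script)
# Appends a WARNING line to LOGFILE when FILE is larger than LIMIT bytes.
check_size() {
    file=$1
    limit=$2
    log=$3
    # Give up early if the file is missing
    [ -f "$file" ] || return 1
    # wc -c prints the size in bytes; the arithmetic expansion strips any padding
    size=$(( $(wc -c < "$file") ))
    if [ "$size" -gt "$limit" ]; then
        echo "$(date) WARNING the file $file is $size bytes" >> "$log"
    fi
}

# Example call (paths are placeholders):
# check_size /load.sh 100 /root/scriptlog.log
```

Passing an absolute log path is deliberate here, since a relative path such as scriptlog.log is resolved against whatever directory the script happens to be started from.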

The next step is to get Linux to run it automatically, which we can do with cron. Cron is a time-based job scheduler in Linux. It is driven by a crontab, a configuration file that specifies what to run and when. My crontab looks like this:

* * * * * /root/script.sh

It is very simple: I run the script every minute. Edit the crontab with

crontab -e
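
If you want a different interval, or want to add the entry without opening an editor, something like the following could work. It is a sketch under assumptions: cron_line is a hypothetical helper name, and the */N step syntax is the one supported by the common Vixie-style cron implementations, so check your distribution before relying on it.

```shell
#!/bin/sh
# cron_line MINUTES COMMAND... (hypothetical helper)
# Prints a crontab entry that runs COMMAND every MINUTES minutes.
cron_line() {
    minutes=$1
    shift
    if [ "$minutes" -eq 1 ]; then
        echo "* * * * * $*"
    else
        echo "*/$minutes * * * * $*"
    fi
}

cron_line 1 /root/script.sh   # the entry used in this post
cron_line 5 /root/script.sh   # the same script every five minutes

# To install an entry non-interactively (this changes the current user's crontab):
# (crontab -l 2>/dev/null; cron_line 1 /root/script.sh) | crontab -
```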

The next step is to configure a management pack template for the Linux logfile so that it triggers on WARNING in the scriptlog.log file. It is also important to keep track of the cron process; fortunately, that is monitored by the default SUSE management pack.
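
Before waiting for an alert, it can help to confirm on the Linux side that the script really writes lines the template could match. The sketch below only approximates the template with grep, assuming the expression configured in the template is simply WARNING; count_warnings is a hypothetical helper name.

```shell
#!/bin/sh
# count_warnings LOGFILE (hypothetical helper)
# Prints how many lines in LOGFILE contain the text WARNING.
count_warnings() {
    grep -c 'WARNING' "$1"
}

# Example call (path is a placeholder):
# count_warnings /root/scriptlog.log
```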

You are now monitoring whether there is a problem with the file size. The next step is to get the size of the file into Operations Manager as performance data. This can also be done with a script and a collection rule. Create a Collection Rule (Probe Based/Script (Performance)) and run the following script with the rule:

Dim oAPI, oBag
Set oAPI = CreateObject("MOM.ScriptAPI")
Set objShell = WScript.CreateObject("WScript.Shell")
' Run stat on the Linux machine through plink; it prints the file size in bytes
Set objExecObject = objShell.Exec("cmd /c C:\plink.exe user@192.168.0.71 -pw password stat -c%s /root/script.sh")
Do While Not objExecObject.StdOut.AtEndOfStream
strText = objExecObject.StdOut.ReadLine()

' Return the size read from plink as the performance value
Set oBag = oAPI.CreatePropertyBag()
Call oBag.AddValue("PerfValue", CDbl(strText))
Call oAPI.Return(oBag)
Loop

This script runs plink.exe. Plink (PuTTY Link) is a command-line connection tool. We will use that to execute commands on the Linux side. The script will then collect the result of the command, the file size, and send it back as a performance data value (PerfValue). I have the same kind of script for Windows here.

The next thing we might want to check is whether the file exists. We can do that with a two-state monitor. In this post you can read how to configure a two-state monitor with a script. Use the script below in your monitor:

Dim oAPI, oBag
Set oAPI = CreateObject("MOM.ScriptAPI")
Set oBag = oAPI.CreatePropertyBag()
Set objShell = WScript.CreateObject("WScript.Shell")
' Test for the file on the Linux machine through plink; the output is "ok" or "bad"
Set objExecObject = objShell.Exec("cmd /c C:\plink.exe user@192.168.0.71 -pw password [ -f /root/thefile.log ] && echo ok || echo bad")
Do While Not objExecObject.StdOut.AtEndOfStream
strValue = objExecObject.StdOut.ReadLine()

' File exists: report a healthy state
If InStr(strValue, "ok") Then
Call oBag.AddValue("Status", "OK")
Call oAPI.Return(oBag)
End If

' File is missing: report an unhealthy state
If InStr(strValue, "bad") Then
Call oBag.AddValue("Status", "Bad")
Call oAPI.Return(oBag)
End If

Loop

That script checks whether thefile.log exists in the /root directory. If it does, it sends back "ok"; otherwise it sends back "bad".
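
The existence test can be tried directly on the Linux machine before it is wired into the monitor. The following is a small sketch of the same check written with if/else; file_status is a hypothetical helper name and not something the monitor requires.

```shell
#!/bin/sh
# file_status FILE (hypothetical helper)
# Prints "ok" when FILE exists and is a regular file, otherwise "bad".
file_status() {
    if [ -f "$1" ]; then
        echo ok
    else
        echo bad
    fi
}

# Example call (path is a placeholder):
# file_status /root/thefile.log
```

The if/else form prints exactly one of the two words. With the shorter a && b || c form, the last command also runs if the middle one fails, which rarely matters for echo but is worth knowing when the commands get more complex.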

Summary: We use a couple of different scripts and forward the results to Ops Mgr. One script echoes to a logfile that we then pick up with the default LogFile management pack template. Another script is run from inside a two-state monitor with the plink.exe tool. In this post I wanted to give you some ideas for getting information into Operations Manager 2007 from your Linux machines.


9 thoughts on “Logfile Check on Linux”

  1. I was just reading your blog post about custom log file monitoring on Linux, which was great, but it is not very helpful for my new conundrum. Would LOVE to see you blog more about cross-platform issues!

    Here is the situation in a nutshell- if you had some recommendations about who excels in this space it would be appreciated, specifically maybe some base scripts available that we could grab for customizations.

    In a nutshell, the “powers that be” want to deploy a .cfg file to all systems. This file will specify a service level (bronze, silver, gold, platinum) and a filesystem that needs this level of service (aka, this service level will need to be correlated to special warning and crit levels for alerting different from the default). So we need dynamic groups that can pick up the output of this file for both filesystem and service-level thresholds. Keep in mind, there could be identical file systems that would have different service levels so I can’t use this as my criteria.

    So dynamic groups- there is no attribute of the logical disk that would identify this. In the authoring console, to create an additional attribute for the say RHEL 5 Logical Disk, I can only define these by two discovery methods. The wizard only lets you use WMI query or registry, neither of which are helpful for xplat. I need to have the ability to add an attribute based on the script values. So this must first have a script and then the output of this script must be able to be used for discovery so I can create an attribute for this.

    Any thoughts?

  2. Hi, I think I would try to keep the monitoring part in Operations Manager. Otherwise you will need to do it with an ssh command in Opalis, if you want to use only default activities.

  3. Hi all, what if you have to monitor a log file on a Linux machine that Operations Manager does not manage (e.g. XenServer)? I think that Opalis is the choice! But how do you configure a workflow to monitor a log file with Opalis?
    Thank you

    S.

  4. I tested both ways today, but still no good result.
    – Re-writing the file with a new error line: I’m still waiting for the alert (already 15 minutes)
    – Adding error lines: result like I mentioned before.

    Monitoring of a Linux log file is not reliable for the moment.

  5. That I did not notice, but I did notice that there are different ways to insert text into a logfile on the Linux side. One way rewrites the file and the other way appends text. The resulting file was the same, but Ops Mgr responds differently to the two ways.

  6. Hi,

    did you notice that the UNIX/Linux logfile monitoring only checks every 5 minutes and loses everything between?
    I added a crontab entry that echos an error into the monitored logfile.
    The 1st entry was detected and after 5 minutes the 6th entry, the 4 in between were not.

    Or did I configure it wrong?
