Omonitor - Configuration

Updated 2009-05-03 by Oles Hnatkevych

Architecture

Agents send simple messages to servers. Each message contains a hostname, a test name, and data. A message may also carry a status if processing happens on the client side. General tests are evaluated on the server side.
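As a rough illustration of the message contents described above, here is a minimal Python sketch. The class name, field names, and the needs_server_test helper are hypothetical. The actual wire format is not described in this document.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Message:
    """Hypothetical in-memory view of one agent message."""
    hostname: str
    testname: str
    data: str
    # Present only when the agent already evaluated the test itself.
    status: Optional[str] = None

    def needs_server_test(self) -> bool:
        # Without a client-side status, the server has to run the test.
        return self.status is None
```

A message without a status would be tested by the server. One that already carries a status would only need to be stored.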

Servers accept messages through xinetd/acceptor or through the daemon, run the test if needed, and put the event into the database. The server side also runs scripts that monitor URLs, ping hosts, and check for events that are no longer being updated (cleaner); those scripts put information directly into the database through the API. After an event is put into the database, email alert processing happens.

A server may forward messages to other servers. In this case messages pass to the remote servers unprocessed, so each remote server processes them according to its own settings. Forwarding is helpful when hosts are pinged from a remote location and the results are passed to another server that may not be able to ping those hosts directly.

The server holds the database and can present it via the web interface.

The Windows agent acts like a server itself: it runs as a service, regularly collects data from remote stations via DCOM/WMI, and sends the unprocessed data to the server.

Web-interface and event colors

The web interface should be self-explanatory. The pages you define in the etc/pages file show the status of the tests you selected. History mode shows non-green events and a history of events.

There is a concept of 'disabling' a test. While a test is disabled, it does not change the color/status of the whole page, so disabling is like confirming that you know about the problem and someone is working on it. When the test goes green again, it is enabled automatically, so the next time it goes red it will change the status of the whole page.
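The disabling behaviour can be sketched in Python as follows. This is an illustration only, not the actual implementation: the MonitoredTest class, the page_color function, and the three-color severity ordering are assumptions made for the example.

```python
# Assumed severity ordering; the real system knows more colors than these.
SEVERITY = {"green": 0, "yellow": 1, "red": 2}


class MonitoredTest:
    def __init__(self, name, color="green"):
        self.name = name
        self.color = color
        self.disabled = False

    def update(self, color):
        """Record a new event; a green event re-enables the test."""
        self.color = color
        if color == "green":
            self.disabled = False


def page_color(tests):
    """Worst color among tests that are not disabled (green if none)."""
    active = [t.color for t in tests if not t.disabled]
    return max(active, key=SEVERITY.__getitem__, default="green")
```

A disabled red test leaves the page green. Once the test reports green it is enabled again, so a later red event turns the page red.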

Event colors are the same as in Big Brother:

Directory structure

Accepting network connections

The server can be set up as an xinetd service. With xinetd you do not need startup scripts; however, running php/cli for every connection can be a performance hit when many connections arrive simultaneously. The other way is to run it as a daemon, which gives a significant improvement in performance. To run the daemon:
php server-daemon.php [-d] <start|stop|restart>

Configuration files in etc/

etc/localhost

Used by the agent and the server. The server needs the PHP variable; the agent needs the LOCALHOST and SERVERS variables and possibly the PHP variable too. The file just overrides defaults from lib/sysconfig-common and lib/sysconfig-OperatingSystem.
# LOCALHOST used by agent to identify itself; defaults to output of `hostname` command
# LOCALHOST=some.host.name

# SERVERS lists servers where to send test data
# SERVERS="some.host another.host:<other port> third.host"
# SERVERS="localhost"

# PHP used by server (and possibly agent) when it has to run php scripts
# PHP="/path/to/php"
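The SERVERS value is a space-separated list in which each entry may carry its own port. The Python sketch below shows one way such a value could be split into host/port pairs. The parse_servers name is hypothetical, and the default port is a parameter because this document does not state it.

```python
def parse_servers(value, default_port):
    """Split 'host host:port ...' into a list of (host, port) tuples."""
    servers = []
    for entry in value.split():
        host, sep, port = entry.partition(":")
        # Entries without ':port' fall back to the caller's default.
        servers.append((host, int(port) if sep else default_port))
    return servers
```

For example, parse_servers("some.host another.host:2000", 12345) gives one pair with the arbitrary placeholder default 12345 and one with the explicit port 2000.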

etc/clients

See clients.sample to get an idea of the format.

The file consists of blocks. Each block starts with

[hostname]
or
[/<pcre>/]
which matches a hostname, and is followed by some check directives:
UP <uptime>
Not implemented yet
LOAD <yellow level> <red level>
# LOAD 1.0 2.0
Define 5min load average thresholds
DISK *|<mount point> <yellow level> <red level>
# DISK * 90 95
# DISK /usr 95 99
# DISK E: 95 99
# DISK /cdrom: 200 200 # never alert
Define disk usage thresholds in percent, as reported by 'df'.
* means all unspecified mountpoints, and has a lower priority.
URL <title> <http://some/url> [/green status pcre/] [/yellow status pcre/] [/red status pcre/]
# URL google http://www.google.com/ !Google!
If an RE is unused, use the empty expression //.
If the green pcre is present, it MUST match the downloaded page.
If the red pcre is present and matches, the status goes red.
If the yellow pcre is present and matches, the status goes yellow.
No spaces are allowed in an RE, so use \s instead.
PROC <substring>|</pcre/> [min] [max] [color] [text="some text"]
The substring or pcre is matched against the output of the 'ps' utility. The substring can be enclosed in double quotes (").
min - integer; default - 1.
max - integer; default - 0 (unlimited).
color - red or yellow.
text= - description of process (defaults to process name/re).
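The DISK and URL rules above can be sketched in Python as follows. This is a hedged illustration, not the server's code. The function names are hypothetical, and Python's re module stands in for PCRE. Four behaviours are assumptions: the DISK percentages are usage figures, a threshold is reached with ">=", a failed green match means red, and red takes precedence over yellow.

```python
import re


def disk_color(mount, used_percent, rules):
    """rules maps a mount point (or "*") to (yellow, red) thresholds."""
    # An explicit mount point wins; "*" is the lower-priority fallback.
    thresholds = rules.get(mount, rules.get("*"))
    if thresholds is None:
        return "green"
    yellow, red = thresholds
    if used_percent >= red:      # assumption: inclusive comparison
        return "red"
    if used_percent >= yellow:
        return "yellow"
    return "green"


def url_color(page, green=None, yellow=None, red=None):
    """Evaluate a downloaded page against the optional status patterns."""
    if green and not re.search(green, page):
        return "red"             # assumption: a missing green match is red
    if red and re.search(red, page):
        return "red"
    if yellow and re.search(yellow, page):
        return "yellow"
    return "green"
```

With rules = {"*": (90, 95), "/usr": (95, 99)}, a /usr filesystem at 96% is yellow because its own rule applies, while /home at 96% is red under the "*" rule.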

etc/pages

If you want separate presentation pages in the web interface, you may create the file etc/pages. Sample format:
page / title="some title"
  group Group one
    one.host.name
    second.host.name
    /hostname pcre/
    /another hostname pcre/
  group Group two
    [ other hosts listed ]
page /subpage title="another title"
  group Group three
    [ other hosts listed ]
page /anothersubpage  tests=conn title="Connections"
  group-only conn Connection status
    [ other hosts listed ]
Possible lines:
page <path> [tests=test1,test2...|notests=test1,test2...] title="title"
group [group title]
group-only test1|test2... [group title]
group-exclude test1|test2... [group title]
hostname.somedomain.com
/somedomain\.com$/
The default group is "" and the default page is "/". So when the file etc/pages does not exist, it is assumed to contain just one row:
/.*/
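Host lines in etc/pages are either a literal hostname or a /pcre/. A minimal Python sketch of that distinction follows. The entry_matches name is hypothetical, and Python's re module is used here in place of PCRE.

```python
import re


def entry_matches(entry, hostname):
    """True if a host line from etc/pages selects the given hostname."""
    if len(entry) >= 2 and entry.startswith("/") and entry.endswith("/"):
        # /.../ entries are regular expressions searched in the hostname.
        return re.search(entry[1:-1], hostname) is not None
    # Anything else is compared as a literal hostname.
    return entry == hostname
```

Under this sketch the default row /.*/ selects every host, which is why a missing etc/pages file shows all hosts on the "/" page.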

etc/hosts

This file tells pinger.php which hosts to ping. By default, each host is sent 4 pings on every pass.

List of lines like this:

<hostname.domainname.tld> [<ip>] [dialup] [lost=<packets>]
If the hostname cannot be resolved, you have to provide its IP address. 'dialup' means that the pinger shows a grey event instead of a red one when no ping response is received at all. The 'lost' value defines how many pings may be lost while the event still shows green; if more are lost, the event is yellow. Red means no response at all. See etc/hosts.sample for an example.
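The color rules for ping results can be summarised in a short Python sketch. The ping_color name and its parameters are hypothetical, and the default of zero tolerated losses is an assumption, since the default for lost= is not stated here.

```python
def ping_color(sent, received, allowed_lost=0, dialup=False):
    """Map the outcome of one ping pass to an event color."""
    if received == 0:
        # No response at all: red, or grey for hosts marked 'dialup'.
        return "grey" if dialup else "red"
    lost = sent - received
    # Up to allowed_lost missing replies still counts as green.
    return "green" if lost <= allowed_lost else "yellow"
```

With the default 4 pings, a host configured with lost=1 stays green when 3 replies come back and turns yellow when only 2 do.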

etc/alerts

If this file exists, an email describing the event is sent whenever an event is 'red' or 'purple'. The file is a list of lines:
alert [host=<host_selector>] [test=<test_selector>] mail=<recipient>[,recipient...]
host_selector: either a comma-separated list of hosts or a /pcre/; defaults to /.*/.
test_selector: either a comma-separated list of tests or a /pcre/; defaults to /.*/.
recipient: either user@domain.com for email or jid:user@domain.com for Jabber notifications.
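To show how the selectors could combine, here is a Python sketch that decides which recipients an alert line yields for a given event. It is an illustration only. The function names are hypothetical and Python's re module replaces PCRE. It also assumes that selectors contain no spaces, so a line can be split on whitespace.

```python
import re


def selector_matches(selector, value):
    """A selector is either /regex/ or a comma-separated list of names."""
    if len(selector) >= 2 and selector.startswith("/") and selector.endswith("/"):
        return re.search(selector[1:-1], value) is not None
    return value in selector.split(",")


def recipients_for(line, host, test, color):
    """Recipients of one 'alert ...' line for an event, or an empty list."""
    if color not in ("red", "purple"):
        return []  # only red and purple events trigger alerts
    # Skip the leading 'alert' keyword and read the key=value fields.
    fields = dict(token.split("=", 1) for token in line.split()[1:])
    if not selector_matches(fields.get("host", "/.*/"), host):
        return []
    if not selector_matches(fields.get("test", "/.*/"), test):
        return []
    return fields["mail"].split(",")
```

A line with only mail= set falls back to /.*/ for both selectors, so it matches every host and every test.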

See etc/alerts.sample for description and examples.

etc/server.inc.php

This file just overrides defaults from lib/omonitor.inc.php.

See etc/server.inc.php.sample.

etc/client.inc.php

This file is used by the agent when it runs more complex tests. So far it is used by the 'slave' test, which checks the MySQL slave connection status and is performed by PHP rather than by a shell script.

See etc/client.inc.php.sample