Servers accept messages through xinetd/acceptor or through the daemon, run tests if needed, and put events into the database. The server side also runs scripts that monitor URLs, ping hosts, and check for events that are no longer being updated (the cleaner); these scripts write directly to the database through the API. After an event is put into the database, email alert processing happens.
A server may forward messages to other servers. In this case the messages are passed on unprocessed, so each remote server processes them according to its own settings. Forwarding is helpful when hosts are pinged from a remote location and the results are passed to another server that may not be able to ping those hosts directly.
The server holds the database and can present it through a web interface.
The Windows agent acts like a server itself: it runs as a service, regularly collects data from remote stations via DCOM/WMI, and sends the unprocessed data to the server.
There is a concept of 'disabling' a test. A disabled test does not change the color/status of the whole page, so disabling is like confirming that you know about the problem and someone is working on it. When the test goes green again, it is enabled automatically, so the next time it goes red it will change the status of the whole page.
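The disabling behaviour described above can be sketched in a few lines. This is only an illustration, not the engine's actual code: the `Test` class, the `update` and `page_status` functions, and the severity ordering of the colors are all assumptions made for the example.

```python
from dataclasses import dataclass

# Severity order is an assumption of this sketch; the source only says that
# a red test changes the status of the whole page unless it is disabled.
SEVERITY = {"green": 0, "yellow": 1, "red": 2}


@dataclass
class Test:
    name: str
    color: str = "green"
    disabled: bool = False


def update(test: Test, new_color: str) -> None:
    """Record a new result; a test that returns to green is re-enabled."""
    test.color = new_color
    if new_color == "green":
        test.disabled = False


def page_status(tests: list[Test]) -> str:
    """Worst color among enabled tests; disabled tests are ignored."""
    active = [t.color for t in tests if not t.disabled]
    return max(active, key=SEVERITY.__getitem__, default="green")
```

With a red `disk` test and a green `cpu` test the page is red. Setting `disk.disabled = True` turns the page green. Once `update(disk, "green")` has run, a later `update(disk, "red")` makes the page red again, because the test was re-enabled automatically.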
Event colors are the same as in Big Brother:
bin/ - all the commands and scripts
etc/ - all the configuration and configuration samples
lib/ - the heart of the engine: default settings, all the libraries and classes
logs/ - where you look for problems
tmp/ - where some working stuff is kept
win/ - what you need to have on a Windows agent
www/ - the web interface part
php server-daemon.php [-d] <start|stop|restart>
    # LOCALHOST used by agent to identify itself; defaults to output of `hostname` command
    # LOCALHOST=some.host.name
    # SERVERS lists servers where to send test data
    # SERVERS="some.host another.host:<other port> third.host"
    # SERVERS="localhost"
    # PHP used by server (and possibly agent) when has to run php scripts
    # PHP="/path/to/php"
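As an illustration of the SERVERS format shown above (space-separated hosts, each with an optional `:<port>` suffix), here is a minimal sketch that splits such a value into host/port pairs. The function name `parse_servers` and the `default_port` parameter are hypothetical; the source does not say which port is used when none is given, so the caller has to supply one.

```python
def parse_servers(value: str, default_port: int) -> list[tuple[str, int]]:
    """Split a SERVERS string into (host, port) pairs.

    default_port is a placeholder chosen by the caller; the real default
    is not stated in the documentation. IPv6 literals are not handled.
    """
    servers = []
    for entry in value.split():
        host, sep, port = entry.partition(":")
        servers.append((host, int(port) if sep else default_port))
    return servers
```

For example, `parse_servers("some.host another.host:2000 third.host", default_port=12345)` yields `[("some.host", 12345), ("another.host", 2000), ("third.host", 12345)]`, where 12345 is an arbitrary placeholder.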
See clients.sample to get an idea of what it is.
Consists of blocks that start with [/<pcre>/], which matches a hostname, followed by some check directives:
    UP <uptime>
Not implemented yet.
    LOAD <yellow level> <red level>
    # LOAD 1.0 2.0
Defines the 5-minute load average thresholds.
    DISK *|<mount point> <yellow level> <red level>
    # DISK * 90 95
    # DISK /usr 95 99
    # DISK E: 95 99
    # DISK /cdrom: 200 200 # never alert
Defines disk space thresholds in percent, as 'df' reports them (the percentage in use, so 90 means 90% full).
    URL <title> <http://some/url> [/green status pcre/] [/yellow status pcre/] [/red status pcre/]
    # URL google http://www.google.com/ !
If an RE is unused, use the empty expression //.
    PROC <substring>|</pcre/> [min] [max] [color] [text="some text"]
The substring or pcre is matched against the output of the 'ps' utility. The substring can be enclosed in double quotes (").
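To make the yellow/red levels and the PROC matching more concrete, here is a hedged sketch of how such directives could be evaluated. The function names are hypothetical. Whether the real engine compares with `>=` or `>` is an assumption. Python's `re` module stands in for PCRE here, so exotic patterns may behave differently.

```python
import re


def threshold_color(value: float, yellow: float, red: float) -> str:
    """Color for a LOAD- or DISK-style pair of levels.

    Using >= (rather than >) is an assumption of this sketch.
    """
    if value >= red:
        return "red"
    if value >= yellow:
        return "yellow"
    return "green"


def count_procs(ps_lines: list[str], pattern: str) -> int:
    """Count 'ps' output lines that match a PROC pattern.

    A pattern written as /.../ is treated as a regular expression;
    anything else (optionally in double quotes) is a plain substring.
    """
    if len(pattern) >= 2 and pattern.startswith("/") and pattern.endswith("/"):
        regex = re.compile(pattern[1:-1])
        return sum(1 for line in ps_lines if regex.search(line))
    needle = pattern.strip('"')
    return sum(1 for line in ps_lines if needle in line)
```

With `LOAD 1.0 2.0`, a 5-minute load of 1.5 gives `threshold_color(1.5, 1.0, 2.0) == "yellow"`. With `DISK /cdrom: 200 200`, a fully used volume (100) stays green, which is why that setting never alerts. The count returned by `count_procs` would then be compared with the optional [min] and [max] values; how the engine treats omitted bounds is not covered here.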
etc/pages. Sample format:
    page / title="some title"
    group Group one
    one.host.name
    second.host.name
    /hostname pcre/
    /another hostname pcre/
    group Group two
    [ other hosts listed ]
    page /subpage title="another title"
    group Group three
    [ other hosts listed ]
    page /anothersubpage tests=conn title="Connections"
    group-only conn Connection status
    [ other hosts listed ]
Possible lines:
    page <path> [tests=test1,test2...|notests=test1,test2...] title="title"
    group [group title]
    group-only tests1|test2... [group title]
    group-exclude tests1|test2... [group title]
    hostname.somedomain.com
    /somedomain\.com$/
The default group is "", and the default page is "/". So when the file etc/pages does not exist, it is assumed to contain just one row:
pinger.php which hosts to ping. By default, each host is sent 4 pings on every pass.
List of lines like this:
    <hostname.domainname.tld> [<ip>] [dialup] [lost=<packets>]
If the hostname is used but cannot be resolved, you have to provide its IP address. 'dialup' means that the pinger shows a grey event instead of a red one when no ping response is received at all. 'lost' defines how many pings may be lost while still showing a green event; more lost pings give yellow. Red means no response at all. See
    alert [host=<host_selector>] [test=<test_selector>] mail=<recipient>[,recipient...]
host_selector: either a comma-separated list of hosts or a /pcre/; defaults to /.*/.
recipient: email@example.com for email, or jid:firstname.lastname@example.org for Jabber notifications.
See etc/alerts.sample for a description and examples.
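The selector rules for alert lines can be illustrated with a short sketch. It assumes that test_selector follows the same rules as host_selector (a comma-separated list or a /pcre/, defaulting to /.*/), which the text above only states for hosts. The function names and the dictionary representation of a rule are hypothetical, and Python's `re` module stands in for PCRE.

```python
import re
from typing import Optional


def selector_matches(selector: Optional[str], name: str) -> bool:
    """True if `name` is selected.

    None stands for an omitted selector, which defaults to /.*/.
    """
    if selector is None:
        return True
    if len(selector) >= 2 and selector.startswith("/") and selector.endswith("/"):
        return re.search(selector[1:-1], name) is not None
    return name in [item.strip() for item in selector.split(",")]


def recipients(rules: list[dict], host: str, test: str) -> list[str]:
    """Collect recipients of every rule whose selectors match host and test.

    Each rule is a dict with optional 'host' and 'test' keys and a
    required 'mail' key holding a comma-separated recipient list.
    """
    found = []
    for rule in rules:
        if selector_matches(rule.get("host"), host) and selector_matches(rule.get("test"), test):
            found.extend(rule["mail"].split(","))
    return found
```

A rule with only `mail=` set matches every host and every test, so its recipients are always included. A rule with `host=/\.example\.org$/` and `test=disk,load` adds its recipients only for disk or load events on hosts under example.org.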