Osburn, Michael | 7 Dec 2007 00:18

mon not reporting on localhost


Hello all,


        I recently installed mon via RPM from the dag repository and cut the checks down to a single test, to make sure everything is set up correctly before I start digging too deeply into creating my own monitors and alerts. The only check I am running right now is http.monitor from the stable tarball, which alerts via mail.alert. If I run the monitor and alert scripts from the command line, they work correctly; however, if I start mon and shut down Apache on the monitored host, I do not get alerted. When I watch mon.cgi, the http service never leaves the unchecked state. Is there something missing from my configuration that would get mon to report what I want (letting me know whether Apache is still alive)?


Thanks,

Michael Osburn

Misc. Info about my setup


[root@localhost ~]# uname -a
Linux localhost.localdomain 2.6.18-8.1.15.el5xen #1 SMP Mon Oct 22 09:01:12 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux
[root@localhost ~]# mon -v
$Id: mon,v 1.22.2.2 2007/06/06 11:46:19 trockij Exp $
$Name: mon-1-2-0-release $
[root@localhost ~]# cat /etc/mon/mon.cf
### Extremely basic mon.cf file

### global options
cfbasedir   = /etc/mon
pidfile     = /var/run/mon.pid
statedir    = /var/lib/mon/state.d
logdir      = /var/lib/mon/log.d
dtlogfile   = /var/lib/mon/log.d/downtime.log
alertdir    = /usr/lib64/mon/alert.d
mondir      = /usr/lib64/mon/mon.d
maxprocs    = 20
histlength  = 100
randstart   = 60s
authtype    = pam
userfile    = /etc/mon/userfile
monerrfile  = /var/lib/mon/log.d/error
dtlogging   = yes

### group definitions (hostnames or IP addresses)
hostgroup servers 127.0.0.1


watch servers
    service http
        interval 5m
        monitor http.monitor
        period wd {Mon-Fri} hr {7am-10pm}
            alert mail.alert root@localhost
            alertevery 1m
        period wd {Sat-Sun}
            alert mail.alert root@localhost
[root@localhost ~]#



_______________________________________________
mon mailing list
mon <at> linux.kernel.org
http://linux.kernel.org/mailman/listinfo/mon
Ed Ravin | 7 Dec 2007 01:00

Re: mon not reporting on localhost

On Thu, Dec 06, 2007 at 04:18:01PM -0700, Osburn, Michael wrote:
...
>    ### group definitions (hostnames or IP addresses)
>    hostgroup servers 127.0.0.1
>    watch servers
>        service http
>            interval 5m
>            monitor http.monitor
>            period wd {Mon-Fri} hr {7am-10pm}
>                alert mail.alert root@localhost
>                alertevery 1m
>            period wd {Sat-Sun}
>                alert mail.alert root@localhost
...

You need a blank line after every hostgroup definition, as described
in the mon man page:

 Hostgroup Entries
     Hostgroup entries begin with the keyword hostgroup, and are followed by
     a hostgroup tag and one or more hostnames or IP addresses, separated by
     whitespace.  The hostgroup tag must be composed of alphanumeric
     characters, a dash ("-"), a period ("."), or an underscore ("_").
     Non-blank lines following the first hostgroup line are interpreted as
     more hostnames.  The hostgroup definition ends with a blank line. For
     example:

            hostgroup servers nameserver smtpserver nntpserver
                 nfsserver httpserver smbserver

            hostgroup router_group cisco7000 agsplus
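
To make the rule quoted above concrete, here is a minimal Python sketch of how a reader following the man page text would collect hostgroup entries. It is only an illustration: mon itself is written in Perl and its real parser is not shown in this thread, the function name parse_hostgroups is hypothetical, and the comment handling is a simplifying assumption.

```python
def parse_hostgroups(config_text):
    """Collect hostgroup definitions as the man page excerpt describes them.

    Illustrative only: this is not mon's parser, and everything outside
    hostgroup blocks is ignored.
    """
    groups = {}
    current = None  # tag of the hostgroup currently being read, if any
    for line in config_text.splitlines():
        stripped = line.strip()
        if not stripped:
            current = None  # a blank line ends the hostgroup definition
            continue
        if stripped.startswith("#"):
            continue  # simplified comment handling (assumption)
        words = stripped.split()
        if words[0] == "hostgroup" and len(words) >= 2:
            current = words[1]
            groups[current] = words[2:]
        elif current is not None:
            # Non-blank lines after the hostgroup line count as more hostnames.
            groups[current].extend(words)
    return groups


with_blank = "hostgroup servers 127.0.0.1\n\nwatch servers\n    service http\n"
without_blank = "hostgroup servers 127.0.0.1\nwatch servers\n    service http\n"

print(parse_hostgroups(with_blank))
print(parse_hostgroups(without_blank))
```

Under this reading, leaving out the blank line means the words of the following watch block are absorbed into the hostgroup as if they were hostnames, which is the situation the man page rule guards against.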
Osburn, Michael | 7 Dec 2007 17:59

RE: mon not reporting on localhost

Here is the latest configuration file on the server. According to mon.cgi, service http is still showing up as unchecked.

Thanks for looking this over


Michael


Attachment (mon.cf): application/octet-stream, 749 bytes
Jim Trocki | 7 Dec 2007 18:55

RE: mon not reporting on localhost

On Fri, 7 Dec 2007, Osburn, Michael wrote:

> Here is the latest configuration file on the server. According to mon.cgi
> service http is still showing up as unchecked.
>
> Thanks for looking this over

Apply this diff to the file "mon" from mon-1.2.0, then try it again and have a
look at your /var/log/messages for a complaint about not being able to find
http.monitor.
--- mon	2007-06-06 07:46:19.000000000 -0400
+++ /home/trockij/mon/1.2/mon	2007-12-07 12:49:50.000000000 -0500
@@ -4,7 +4,7 @@
 #
 # Jim Trocki, trockij@arctic.org
 #
-# $Id: mon,v 1.22.2.2 2007/06/06 11:46:19 trockij Exp $
+# $Id: mon,v 1.22.2.3 2007/12/05 17:54:55 aschwer Exp $
 #
 # Copyright (C) 1998 Jim Trocki
 #
@@ -25,9 +25,9 @@
 #
 use strict;
 
-my $RCSID='$Id: mon,v 1.22.2.2 2007/06/06 11:46:19 trockij Exp $';
+my $RCSID='$Id: mon,v 1.22.2.3 2007/12/05 17:54:55 aschwer Exp $';
 my $AUTHOR='trockij@arctic.org';
-my $RELEASE='$Name: mon-1-2-0-release $';
+my $RELEASE='$Name:  $';
 
 #
 # NetBSD rc.d script compatibility
@@ -5392,11 +5392,13 @@
 
 no warnings; # Redefining syslog
 sub syslog {
-   eval {
-       local $SIG{"__DIE__"}= sub { }; 
-       my @log = map { s/\%//mg; } @_;
-       Sys::Syslog::syslog(@log);
-   }
+    return if (!@_);
+
+    my $pri = (@_ == 1) ? "err" : shift;
+
+    eval { Sys::Syslog::syslog ($pri, '%s', "@_"); };
+
+    print STDERR "syslog error: $@\n" if ($@ ne "");
 }
 use warnings;
 
Osburn, Michael | 7 Dec 2007 22:11

RE: mon not reporting on localhost

Jim,


Thanks for the patch. I am not sure what caused the issue, but when I replaced the mon executable with the one from 1.2.0 (patch applied), I started getting alerts upon reload. I have seen about 5 errors show up in the logs stating
Dec  7 13:03:23 localhost mon[13490]: no monitor found while trying to run [http.monitor]
but everything appears to be running correctly. I will keep an eye on this. I am going to build a second server in a different section of the network to see whether the issue is with the RPM I downloaded.

Thanks.

Michael


Jim Trocki | 7 Dec 2007 22:58

RE: mon not reporting on localhost

On Fri, 7 Dec 2007, Osburn, Michael wrote:

> Thanks for the patch. Not sure what caused the issue but when I
> replaced the mon executable with the one from 1.2.0 (patch applied)
> I started getting alerts upon reload. I have seen about 5 errors show
> up in the logs stating

> Dec  7 13:03:23 localhost mon[13490]: no monitor found while trying
> to run [http.monitor]

Yes, this means it didn't find http.monitor in either the default place or the
place where you told it to look.

If you start mon like "./mon -d -c /path/to/mon.cf" it will dump out the list
of places it's looking. The directory you're looking for is labeled "mondir" in
the config file, and usually it's something like "/usr/lib/mon/mon.d". In the
"-d" debug output it is incorrectly labeled "scriptdir" (which I just noticed).

You can either just copy your http.monitor into the place it's looking, or
better yet, just make the "mondir" path point to the "mon.d" directory that
comes with mon.

When mon starts up or is told to reload its configuration, it looks for every
"monitor" and "alert" in your config in the "mondir" and "alertdir" paths,
respectively. You can specify multiple directories separated by a colon, like
"mondir = /usr/local/mystuff/mon.d:/usr/lib/mon", and you'd put your own
homebrew or site-specific monitors in "/usr/local/mystuff/mon.d", where
it will look first before looking in the other location.
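
The search order described above can be sketched in a few lines of Python. This is a hedged illustration, not mon's Perl implementation: the helper name find_monitor is hypothetical, the executable-bit check is an assumption about what "found" means, and the temporary directories merely stand in for the two "mondir" entries.

```python
import os
import tempfile


def find_monitor(name, mondir):
    """Return the first executable file called `name` in a colon-separated path.

    Hypothetical helper: it mirrors the search order described above,
    not mon's actual code.
    """
    for directory in mondir.split(":"):
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None  # the case in which mon logs "no monitor found"


# Throwaway directories standing in for a site-specific and a stock mon.d.
with tempfile.TemporaryDirectory() as site_dir, tempfile.TemporaryDirectory() as stock_dir:
    for directory in (site_dir, stock_dir):
        path = os.path.join(directory, "http.monitor")
        with open(path, "w") as handle:
            handle.write("#!/bin/sh\nexit 0\n")
        os.chmod(path, 0o755)

    search_path = site_dir + ":" + stock_dir
    found = find_monitor("http.monitor", search_path)
    missing = find_monitor("no_such.monitor", search_path)
    found_in_site_dir = found == os.path.join(site_dir, "http.monitor")
    print(found, missing)
```

Because the directories are tried left to right, a site-specific copy in the first directory shadows the stock monitor of the same name in the second.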
Osburn, Michael | 7 Dec 2007 23:56

RE: mon not reporting on localhost

The odd thing about that is that the script had not moved, and every other time it ran the script was found and run correctly; there were just a few instances where mon forgot that it was there. I have been severely thrashing the disks all afternoon and causing all sorts of errors, trying to create problems with (artificial) load so that I can check some home-brew parts of the http.monitor file. This has pushed the server's load almost 200% higher than it would see in production and caused the array to lock up its read cache. That part is my own doing.

Thanks for putting out such a useful tool.


Jacques Klein | 10 Dec 2007 16:13

Using depend ...

I added "depend ...." lines to some of my services in my mon.cf, and also
"dep_behavior = hm".
I am using mon-1.2.0, where I added
    use strict;
    use warnings;
for debugging purposes.

I get
   Use of uninitialized value in concatenation (.) or string at /symlnks/common/dsnmon/2.0/mond line 5324
This happens at
    } elsif ($watch{$group}->{$service}{"_last_failure_time"} >= (time - $watch{$group}->{$service}{"dep_memory"})) {
It seems that $watch{$group}->{$service}{"_last_failure_time"} is undefined.

The unix command "grep _last_failure_time mon" outputs
            $depval = $SUCCESS{$sref->{"_op_status"}} && ($sref->{"_last_failure_time"} < (time - $sref->{"dep_memory"}));
                      && ($sref->{"_last_failure_time"} < (time - $sref->{"dep_memory"}));
        } elsif ($watch{$group}->{$service}{"_last_failure_time"} >= (time - $watch{$group}->{$service}{"dep_memory"})) {

which shows that $sref->{"_last_failure_time"} is used but never set!
Jim Trocki | 10 Dec 2007 19:49

Re: Using depend ...

On Mon, 10 Dec 2007, Jacques Klein wrote:

> also "dep_behavior = hm"
> I am using mon-1.2.0 where I added

>
> shows that    $sref->{"_last_failure_time"}     is used but never set !!!!.
>

Try this patch and let me know if it helps:

--- mon	2007-12-10 13:35:43.000000000 -0500
+++ mon-dep	2007-12-10 13:38:48.000000000 -0500
@@ -1392,6 +1392,7 @@
  		$sref->{"_start_of_monitor"} = time if (!defined($sref->{"_start_of_monitor"}));
  		$sref->{"_alert_count"} = 0 if (!defined($sref->{"_alert_count"}));
  		$sref->{"_last_failure"} = 0 if (!defined($sref->{"_last_failure"}));
+		$sref->{"_last_failure_time"} = 0 if (!defined($sref->{"_last_failure_time"}));
  		$sref->{"_last_success"} = 0 if (!defined($sref->{"_last_success"}));
  		$sref->{"_last_trap"} = 0 if (!defined($sref->{"_last_trap"}));
  		$sref->{"_last_traphost"} = '' if (!defined($sref->{"_last_traphost"}));
@@ -3287,6 +3288,7 @@
  	$sref->{"_failure_count"}++;
  	$sref->{"_consec_failures"}++;
  	$sref->{"_last_failure"} = $tmnow;
+	$sref->{"_last_failure_time"} = $tmnow; # used by the "dep_memory" option
  	if ($sref->{"_op_status"} == $STAT_OK ||
  		$sref->{"_op_status"} == $STAT_UNKNOWN ||
  		$sref->{"_op_status"} == $STAT_UNTESTED)
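
For readers following the logic rather than the Perl, here is a small Python sketch of the "$depval" test from the grep output earlier in this thread, with "_last_failure_time" defaulting to 0 as in the patch. The function and parameter names are hypothetical, and treating "dep_memory" as a number of seconds is an assumption based on its being subtracted from "time".

```python
import time


def dependency_ok(op_status_ok, last_failure_time, dep_memory, now=None):
    """Mirror the `$depval` expression quoted from mon's source.

    A dependency counts as satisfied only when it is currently OK and its
    last failure is older than `dep_memory` seconds.
    """
    if now is None:
        now = time.time()
    return op_status_ok and last_failure_time < (now - dep_memory)


now = 1_000_000  # fixed clock value so the example is repeatable

# Never failed: the patch initialises _last_failure_time to 0.
never_failed = dependency_ok(True, 0, dep_memory=600, now=now)
# Failed 5 minutes ago with a 10 minute memory: still treated as failed.
recent_failure = dependency_ok(True, now - 300, dep_memory=600, now=now)
# Failed 20 minutes ago: outside the memory window, so it counts as OK.
old_failure = dependency_ok(True, now - 1200, dep_memory=600, now=now)

print(never_failed, recent_failure, old_failure)
```

With the default of 0, a service that has never failed compares as "failed long ago", so the comparison no longer sees an undefined value.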
Alex Moen | 27 Dec 2007 17:18

cisco_interface.monitor

Hi all,

Looking to see if anyone has updated the cisco_interface.monitor to work with "modern" Perl implementations.  I am by no means a perlmonger, so I am just trying to use it to monitor my Cisco interfaces, but I cannot seem to make it work no matter what version of Net-SNMP I install, so I figure that it was probably written for an older version of Net-SNMP and is having "backwards compatibility" issues.  Namely, the script complains of the unavailability of SNMP::Session (or SNMP/Session.pm), which does not seem to exist in the newer versions of Net-SNMP.

Has anyone else been able to overcome this problem?  If so, how?  I really would like to use Mon to watch my interfaces, possibly as a replacement for HP's OpenView Network Node Manager.

Thanks for any input that you can give!
