Bill | 3 Nov 2005 18:12

Prefixing the alert subject and OpenBSD install tutorial


[sorry to the moderator who has to deal with my post from an
unregistered account]

Thanks for this program - I've been using it for probably about two
years now.  In the last few months I've been starting to do more than
the basics (dependencies, etc) which has been fun.

I have two questions:

 I am monitoring from a bunch of systems and would love a way to put
something before or after the ALERT in my mail alert subjects, set
from the mon.cf file.

So instead of ALERT watchname/server ...

I could have

	CONAME:ALERT watchname/server

where CONAME would be a company name, division name, or whatever.

Obviously I could go in and change the mail.alert code on each
monitoring station, but that forks the code.  I could also create a new
mail alert script and use that...

If this is not possible today, would a patch for it be accepted?
Or should I just modify the mail alert?

My second question / comment is that I have written up a how-to for

Bill | 3 Nov 2005 18:29

Alerts coming too often... why?


I seem to be having a problem with alerts being sent too often now
(alert added below).

These are part of a ping that should not send more than once an hour.
I have the mon.cf set up with what I think is correct, but maybe I am
missing something.  (mon.cf below).

I have it set to alert every 60 minutes, but I get them about every 5
minutes.  In reading the docs I noticed the results have to be the
same for each entry, otherwise it resends.  I noticed the fping entries
in my log are different.

Would that be causing it?

I was just manually doing the fping to see if I get output back each
time (or occasionally), but some damn fool actually went and reset the
node on me - so it's up.

This is the alert I get
---------------------------------------------
Summary output        : 10.4.60.1

Group                 : goodwinwifi
Service               : ping
Time noticed          : Thu Nov  3 12:15:47 2005
Secs until next alert : 
Members               : 10.4.60.1 10.4.61.1 10.4.62.1 10.4.63.1 10.4.64.1


Ed Ravin | 3 Nov 2005 19:10

Re: Prefixing the alert subject

On Thu, Nov 03, 2005 at 12:12:00PM -0500, Bill wrote:
> 
> I have two questions:
> 
>  I am monitoring from a bunch of systems and would love a way to put
> something before or after the ALERT in my mail alert subjects, set
> from the mon.cf file.

Quick clip from the Mon man page:

     As  with  monitor programs, alert programs are invoked with environment
     variables defined by the user in the service definition, in addition to
     the following which are explicitly set by the server:

Here's an example from one of my configs:

    service freespace
    description Is there 5GB free? Enough inodes?
    depend SELF:ping
    MIBDIRS=/usr/local/share/snmp/local-mibs:/usr/local/share/snmp/mibs
    interval 7m
    monitor netappfree.monitor

In this case, the monitor script won't work properly without MIBDIRS
defined.  You can use this feature to pass environment vars into your
script, so the same alert script could take different actions or send
different messages based on the contents of an environment var.
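
Applied to the question above, a minimal mon.cf sketch might look like
this.  CONAME and its value are made-up names for illustration (Mon does
not define them), and the group, address and mail recipient are
placeholders; only the NAME=value line inside the service block follows
the MIBDIRS example.

```
hostgroup routers 192.0.2.1

watch routers
    service ping
        CONAME=ExampleCo
        interval 5m
        monitor fping.monitor
        period wd {Sun-Sat}
            alert mail.alert admin@example.com
```

The alert script would still have to read the variable itself (in Perl,
$ENV{'CONAME'}) and put it into the subject.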

David Nolan | 3 Nov 2005 19:34

Re: Alerts coming too often... why?


--On Thursday, November 03, 2005 12:29:55 -0500 Bill <Bill <at> explosivo.com> 
wrote:

> I have it set to alert every 60 minutes, but I get them about every 5
> minutes.  In reading the docs I noticed the results have to be the
> same for each entry, otherwise it resends.  I noticed the fping entries
> in my log are different.

Yes, the default behavior is that a new alert is generated whenever the
summary of the failure changes.  If you're running the current Mon from CVS
you can control that by saying 'alertevery 60m strict'.
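
As a sketch of where that option would go in mon.cf: the hostgroup
members below are taken from the alert in the original post, while the
5m interval, the period and the mail address are assumptions, and
'strict' needs the CVS version.

```
hostgroup goodwinwifi 10.4.60.1 10.4.61.1 10.4.62.1 10.4.63.1 10.4.64.1

watch goodwinwifi
    service ping
        interval 5m
        monitor fping.monitor
        period wd {Sun-Sat}
            alert mail.alert admin@example.com
            alertevery 60m strict
```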

Alternatively you could figure out why your monitor is generating
inconsistent output.  Based on this string from your syslog output,
"unidentified output from fping", I'm guessing your monitor script isn't
correctly processing all of the fping output.  I believe you might need a
newer version of the fping.monitor script.  If the latest version from CVS
doesn't help, send us the version information for your version of fping and
we'll see if we can fix it.

-David

David Nolan                    <*>                    vitroth+ <at> cmu.edu
curses: May you be forced to grep the termcap of an unclean yacc while
      a herd of rogue emacs fsck your troff and vgrind your pathalias!

Bill | 3 Nov 2005 19:44

Re: Prefixing the alert subject

On Thu, 3 Nov 2005 13:10:18 -0500
Ed Ravin <eravin <at> panix.com> spake:

> On Thu, Nov 03, 2005 at 12:12:00PM -0500, Bill wrote:
> > 
> > I have two questions:
> > 
> >  I am monitoring from a bunch of systems and would love a way to put
> > something before or after the ALERT in my mail alert subjects, set
> > from the mon.cf file.
> 
> Quick clip from the Mon man page:
> 
>      As  with  monitor programs, alert programs are invoked with environment
>      variables defined by the user in the service definition, in addition to
>      the following which are explicitly set by the server:
> 
> Here's an example from one of my configs:
> 
>     service freespace
>     description Is there 5GB free? Enough inodes?
>     depend SELF:ping
>     MIBDIRS=/usr/local/share/snmp/local-mibs:/usr/local/share/snmp/mibs
>     interval 7m
>     monitor netappfree.monitor
> 
> In this case, the monitor script won't work properly without MIBDIRS
> defined.  You can use this feature to pass environment vars into your
> script, so the same alert script could take different actions or send
> different messages based on the contents of an environment var.

Bill | 3 Nov 2005 19:49

Re: Alerts coming too often... why?

On Thu, 03 Nov 2005 13:34:58 -0500
David Nolan <vitroth+ <at> cmu.edu> spake:
> 
> 
> --On Thursday, November 03, 2005 12:29:55 -0500 Bill <Bill <at> explosivo.com> 
> wrote:
> 
> > I have it set to alert every 60 minutes, but I get them about every 5
> > minutes.  In reading the docs I noticed the results have to be the
> > same for each entry, otherwise it resends.  I noticed the fping entries
> > in my log are different.
> 
> Yes, the default behavior is that a new alert is generated whenever the
> summary of the failure changes.  If you're running the current Mon from CVS
> you can control that by saying 'alertevery 60m strict'.
> 
> Alternatively you could figure out why your monitor is generating
> inconsistent output.  Based on this string from your syslog output,
> "unidentified output from fping", I'm guessing your monitor script isn't
> correctly processing all of the fping output.  I believe you might need a
> newer version of the fping.monitor script.  If the latest version from CVS
> doesn't help, send us the version information for your version of fping and
> we'll see if we can fix it.

I was travelling down that road when somebody finally fixed the problem
(after 2 days) and my intermittent target went away.  It's bound to
happen again though - the wifi nodes seem to have been in a tizzy the
past two weeks (brownouts I think).

So is the CVS version relatively stable?  Mon is not mission critical stuff

David Nolan | 3 Nov 2005 20:04

Re: Alerts coming too often... why?


--On Thursday, November 03, 2005 13:49:15 -0500 Bill <Bill <at> explosivo.com> 
wrote:

> So is the CVS version relatively stable?  Mon is not mission critical stuff
> here, so I'd be more than happy to run that on a bunch of machines.
>
> Right now I am on 0.99.2
>
> I was eyeing CVS the other day...  debating it.

CVS is definitely more stable than 0.99.2.  0.99.2 has some nasty bugs,
including some crash and burn type bugs.

I need to spend some time integrating a few last bug fixes into CVS and
then we're ready to call it a release.

-David

David Nolan                    <*>                    vitroth+ <at> cmu.edu
curses: May you be forced to grep the termcap of an unclean yacc while
      a herd of rogue emacs fsck your troff and vgrind your pathalias!

Ed Ravin | 3 Nov 2005 20:06

Re: Prefixing the alert subject

On Thu, Nov 03, 2005 at 01:44:21PM -0500, Bill wrote:
[about including environment variables in the alert subject]

[...]
> > In this case, the monitor script won't work properly without MIBDIRS
> > defined.  You can use this feature to pass environment vars into your
> > script, so the same alert script could take different actions or send
> > different messages based on the contents of an environment var.
> 
> Yeah, I was hoping that there was already some way to do it without
> creating another alert program - I loathe re-inventing the wheel so to
> speak.  But I have no problem doing so if it has not been done.

I have something like it on my site in our locally modified copy of
mail.alert:

   $desc= $ENV{'MON_DESCRIPTION'} || "";

   [...]

        $ALERT= "ALERT";
        $t= localtime($failtime);
        $downmsg= "Down for $downtime seconds";
        $downmsg .= "\n\nNotes: $desc\n" if length($desc);

In this case, I'm using the "description" field in the mon config,
which is also viewable in the GUI.  I use this field for suggestions
on what to do if the service goes down, which helps a lot when someone
less familiar with the system has to handle an alarm at 3 AM.
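
The same approach could cover the subject prefix asked about earlier.
What follows is only a sketch, not the stock mail.alert code: CONAME is
a hypothetical variable name that would have to be set in the service
definition, and the "ALERT group/service: summary" layout is assumed
from the example subject in the original question.

```perl
#!/usr/bin/perl
# Sketch only: build an alert subject with an optional prefix taken from
# an environment variable.  CONAME is a hypothetical name, not something
# Mon sets on its own.
use strict;
use warnings;

# Return the subject line, with "PREFIX:" prepended when a prefix is set.
sub alert_subject {
    my ($prefix, $group, $service, $summary) = @_;
    my $alert = "ALERT";
    $alert = "$prefix:$alert" if defined $prefix && length $prefix;
    return "$alert $group/$service: $summary";
}

# Made-up values for illustration; a real alert script would take the
# group and service from its arguments and the summary from its input.
print alert_subject($ENV{'CONAME'}, "goodwinwifi", "ping", "10.4.60.1"), "\n";
```

With CONAME unset the subject comes out unchanged, so one script could
be shared by stations that set a prefix and stations that do not.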


Frank Isemann | 3 Nov 2005 21:12

last_output

hi :)

is there anything like last_output?

last_summary is the first line of the monitor output ... but i want more ;>

i found "last_detail", but it is just "last_detail='\0a'" for my service,
which returns more than one line (without a service failure state).

in mon.cgi it is described as: *Detail output from the most recent failure
of this service* (last_detail)

is that only filled in if the service returns an exit value > 0?

PS: mon is fantastic!!

greetz frank

Ed Ravin | 3 Nov 2005 21:37

Re: last_output

On Thu, Nov 03, 2005 at 09:12:40PM +0100, Frank Isemann wrote:
> 
> is there anything like last_output?
> 
> last_summary is the first line of the monitor output ... but i want more ;>

Mon 1.1 has the env var MON_LAST_OUTPUT, but I don't think my fixes that
actually make it work are in CVS yet.

> i found "last_detail", but it is just "last_detail='\0a'" for my service,
> which returns more than one line (without a service failure state).
>
> in mon.cgi it is described as: *Detail output from the most recent failure
> of this service* (last_detail)
>
> is that only filled in if the service returns an exit value > 0?

No, that should be valid regardless of exit status.  Works for me
with fping.monitor, where you see detailed stats of all the hosts
pinged even if all of them responded properly.
