David Baldwin | 1 Aug 2011 02:13

Re: Add page name to alert

On 30/07/11 6:21 AM, Prakash, Arjun wrote:

> Hi,
>
> A newbie question:
>
> I have hosts categorized in pages (in hosts.cfg) and want the email alerts to show the page name in the subject. Can this be done?

It can be done, but not easily.

You need to use a script to send your e-mail alert. There are some on xymonton - http://xymonton.org/alerts

As part of your script you will need to get the page name. It is available via the xymondboard/hobbitdboard command, using the XMH/BBH fields:

http://www.xymon.com/xymon/help/manpages/man5/xymon-xmh.5.html

Depending on your Xymon version, use one of the commands below. The first is the syntax for 4.3.0 or later, the second is for older versions. Replace xymonhost with your server name, or use localhost if running on the Xymon server:

xymon xymonhost "xymondboard host=$BBHOSTNAME test=info fields=XMH_PAGENAME"
bb xymonhost "hobbitdboard host=$BBHOSTNAME test=info fields=BBH_PAGENAME"
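
A minimal sketch of the lookup step of such an alert script, in Python. It assumes Xymon 4.3.0 or later, that the xymon client binary is on the PATH, that the alert environment provides BBHOSTNAME (as in the commands above) plus BBSVCNAME and BBCOLORLEVEL, and that the board query prints the requested field on a single line. The function names and the subject format are made up for illustration.

```python
import os
import subprocess


def board_query(hostname):
    """Build the xymondboard request for one host's page name."""
    return "xymondboard host=%s test=info fields=XMH_PAGENAME" % hostname


def page_name(hostname, server="localhost"):
    """Ask the Xymon server for the page name; empty string if nothing comes back."""
    result = subprocess.run(
        ["xymon", server, board_query(hostname)],
        capture_output=True, text=True, check=True,
    )
    lines = result.stdout.splitlines()
    return lines[0].strip() if lines else ""


def subject(page, hostname, service, level):
    """Compose a mail subject that carries the page name (hypothetical format)."""
    return "[%s] %s:%s %s" % (page or "no-page", hostname, service, level)


if __name__ == "__main__":
    host = os.environ.get("BBHOSTNAME", "")
    if host:  # only query when run from an alert with the variable set
        service = os.environ.get("BBSVCNAME", "unknown")    # assumed to be set
        level = os.environ.get("BBCOLORLEVEL", "unknown")   # assumed to be set
        print(subject(page_name(host), host, service, level))
```

The printed subject would then be handed to whatever mail command the script uses. On older versions the query would use hobbitdboard and BBH_PAGENAME instead.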

David.
-- David Baldwin - Assistant Director, Infrastructure (acting) Information and Communication Technology Services Australian Sports Commission http://ausport.gov.au Tel 02 62147830 Fax 02 62141830 PO Box 176 Belconnen ACT 2616 david.baldwin-vENfGX7yJcD7kZZSymWMOg@public.gmane.org Leverrier Street Bruce ACT 2617
_______________________________________________
Xymon mailing list
Xymon@...
http://lists.xymon.com/mailman/listinfo/xymon
Henrik Størner | 1 Aug 2011 17:14

Re: GROUPs and recovery alerts

On 05-07-2011 11:58, Heather Keen wrote:

> Anyway, I think this is a BUG.
>
> Xymon Version 4.3.3.
> Configuration as follows:
>
> analysis.cfg:
> HOST=myhost.mydomain.com GROUP=heather
>          PROC TESTtestTEST 1
>
> alerts.cfg:
> HOST=*
>          MAIL heather1@... RECOVERED
>
> GROUP=heather
>          MAIL heather2@... RECOVERED
>
>
> When the alert is generated, both e-mail addresses get the notification.
> But when the alert is cleared, only heather1@... gets the recovery message.
>
> I've tried lots of different configuration options, and the only
> conclusion I can come to is that recovery messages to GROUPs do not
> work.  :(

It's certainly not what you would expect - must agree with that. But 
solving it is not quite as easy as one would expect.

The problem is that when the PROC triggers a red status, Xymon knows 
that the rule was one that included a "GROUP=heather" setting. But when 
the recovery happens, it is because none of the rules in analysis.cfg 
triggered. So Xymon does not know that the green status is a recovery 
from a rule that contained the GROUP setting.

There is some state lost here.

To solve this, the xymond_alert module will have to keep track of the 
active alerts, and which GROUP settings triggered them. When the 
recovery happens, it will then use that list of groups that received the 
alert as the basis for sending out the recovered-notices.

It can be solved, of course. Just don't be disappointed when you see 
4.3.4 being released later today without a fix for this problem.

Regards,
Henrik

Henrik Størner | 1 Aug 2011 17:37

Re: GROUPs and recovery alerts

On 01-08-2011 17:14, Henrik Størner wrote:
> On 05-07-2011 11:58, Heather Keen wrote:
>> I've tried lots of different configuration options, and the only
>> conclusion I can come to is that recovery messages to GROUPs do not
>> work. :(
>
> It's certainly not what you would expect - must agree with that. But
> solving it is not quite as easy as one would expect.

After looking at this once again, I actually think there is a very 
simple solution to this after all. If we don't check the GROUP rules at 
all for recovery-messages (i.e. any group setting will match), then 
xymond_alert will consider all the possible recipients. However, there 
is another check so it only sends recovery-messages to those recipients 
that actually did receive the alert. So I think the attached patch 
should solve this.
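
To illustrate the idea (this is not the attached patch, whose contents are not shown here), a small Python model of the recipient selection. The rule layout and function names are hypothetical, a rule without a host is treated as HOST=*, and the recipient labels stand in for the elided addresses.

```python
def alert_recipients(rules, host, group):
    """Recipients for a status change: host must match, and GROUP must match if set."""
    return [r["mail"] for r in rules
            if r["host"] in ("*", host) and r.get("group") in (None, group)]


def recovery_recipients(rules, host, alerted):
    """Recipients for a recovery: skip the GROUP check, keep only those already alerted."""
    candidates = [r["mail"] for r in rules if r["host"] in ("*", host)]
    return [mail for mail in candidates if mail in alerted]


rules = [
    {"host": "*", "mail": "heather1"},
    {"host": "*", "group": "heather", "mail": "heather2"},
]

# The red status comes from a rule carrying GROUP=heather, so both rules match.
alerted = set(alert_recipients(rules, "myhost.mydomain.com", "heather"))

# On recovery no analysis.cfg rule triggered, so no group is known.
old_behaviour = alert_recipients(rules, "myhost.mydomain.com", None)
new_behaviour = recovery_recipients(rules, "myhost.mydomain.com", alerted)
print(old_behaviour, new_behaviour)
```

In this model the old selection drops heather2 on recovery, while the new one returns both recipients and still leaves out anyone who never received the alert.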

Regards,
Henrik
Attachment (group-recovered.diff): text/x-patch, 797 bytes
Henrik Størner | 1 Aug 2011 18:03

Re: Acknowledge Alert web page - patch for durations

On 05-07-2011 13:43, Heather Keen wrote:

> For anyone that is interested ... I've written a patch for the
> Acknowledge Alert page so that you can select a duration in mins, hours
> or days (for those of us who have long term problems!). Thought I'd
> share it with y'all.

Applied for 4.3.4.

Regards,
Henrik

Josh Luthman | 1 Aug 2011 17:46

Re: Looking for "page too old" script

Got it working!  Thanks!!!

FILE    /var/www/inxwireless/network/output.png red mtime<1800

This is correct.  Going by the documentation, which I may be misreading, this looks backwards.

Initially I did mtime>1800 based on what I read on the man page.  The test went red "File was modified 31 seconds ago - should be >1800"
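
Read this way, the condition in a FILE rule states what the file's age must be, and the status changes colour when the condition is not met. A minimal Python sketch of that reading, based only on the behaviour reported above. It is not Xymon's code, the function names are made up, and the failure colour is fixed to red as in the rule shown.

```python
import os
import time


def file_age(path, now=None):
    """Seconds since the file was last modified."""
    now = time.time() if now is None else now
    return now - os.stat(path).st_mtime


def mtime_check(age_seconds, rule):
    """Evaluate a rule such as 'mtime<1800' as a requirement on the file's age.

    Returns 'green' when the requirement holds and 'red' otherwise.
    """
    prefix = len("mtime")
    op, limit = rule[prefix], int(rule[prefix + 1:])
    ok = age_seconds < limit if op == "<" else age_seconds > limit
    return "green" if ok else "red"


# A file modified 31 seconds ago, checked against both variants of the rule.
print(mtime_check(31, "mtime<1800"), mtime_check(31, "mtime>1800"))
```

So mtime<1800 asks for a file younger than 30 minutes, which is what a "page too old" check needs.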

Josh Luthman
Office: 937-552-2340
Direct: 937-552-2343
1100 Wayne St
Suite 1337
Troy, OH 45373


On Fri, Jul 29, 2011 at 4:31 PM, Tim McCloskey <tm <at> freedom.com> wrote:
> Do I put this under [linux]?  If I do this, will it only have the clients with an appropriate FILE check or ask this file of every linux box?

Just took another look at a working config. Here are a couple of notes.
You don't need the whole [linux] class, you can use a [hostname] in client-local.cfg
...
[your-hostname]
file:/var/www/foo/bar/output.png
...

The stanza above with a matching FILE directive from hobbit-clients.cfg (both server side) works for me, on 4.2.0.

If you have this set up (sounds like you do) and it's not working, you can try touching the logfetch.$foo.cfg and logfetch.$foo.status files in the tmp dir on the client (create them however you need, just make sure to chown them to your hobbit user).

You should not need to populate the files, just create them. Example file...
cat someclienthostname:~hobbit/client/tmp/logfetch.someclienthostname.cfg
log:/some/path/to/somefile:10240
file:/bin/su

Tim



Rob McBroom | 1 Aug 2011 20:04

Re: AIX memory checks

On Jun 29, 2011, at 9:15 AM, Rob McBroom wrote:

> The problem seems to have cleared on its own for now. If it reappears, I’ll send the output from that command. Thanks.

OK, here's one. The report looks like this:

    hostname:memory red [884137]
    red Mon Aug  1 11:23:33 EDT 2011 - Memory CRITICAL
      Memory              Used       Total  Percentage
    &red Physical    18446744073709546509M      16384M18446744073709551585%
    &green Swap                 40M       4096M          0%

The vmstat section looks like this:

    [vmstat]

    System configuration: lcpu=16 mem=28672MB ent=0.50

    kthr    memory              page              faults              cpu          
    ----- ----------- ------------------------ ------------ -----------------------
     r  b   avm   fre  re  pi  po  fr   sr  cy  in   sy  cs us sy id wa    pc    ec
     1  1 1025924 5498980   0   0   0   0    0   0  25 2080 1221  8  4 88  0  0.10  19.6
     2  1 1026236 5498663   0   0   0   0    0   0  25 2152 1213  8  4 88  0  0.10  19.6
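
One possible reading of the huge numbers, sketched in Python: they look like small negative values printed as unsigned 64-bit integers. The free memory in the vmstat sample, assuming 4 KB pages, is larger than the 16384M total in the report, so total minus free would come out negative. This is an interpretation of the figures, not a confirmed diagnosis.

```python
WRAP = 2 ** 64  # size of the unsigned 64-bit range


def as_signed64(value):
    """Reinterpret an unsigned 64-bit number as a signed one."""
    return value - WRAP if value >= 2 ** 63 else value


used_mb = as_signed64(18446744073709546509)    # "Used" column of the report
used_pct = as_signed64(18446744073709551585)   # "Percentage" column

# Free memory from the first vmstat sample, assuming 4 KB pages.
free_mb = 5498980 * 4 // 1024

print(used_mb, used_pct, free_mb, 16384 - free_mb)
```

The signed values are -5107 MB and -31%, which fit together: -5107 out of 16384 is about -31%.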

-- 
Rob McBroom
<http://www.skurfer.com/>


Christoph Schug | 1 Aug 2011 20:32

Xymon PROC check fails on un-aligned ps(1) output

I have got a question regarding Xymon 4.3.3 (running on CentOS
5.6/x86_64). In order to monitor the existence of certain processes like
rsyslogd(8) I have the following process rule defined in analysis.cfg:

CLASS=linux
    PROC     "%^/sbin/rsyslogd -m 0$"

This works fine as long as the columns in the output of ps(1) (more
specifically “ps -Aww -o
pid,ppid,user,start,state,pri,pcpu,time,pmem,rsz,vsz,cmd” as defined in
xymonclient-linux.sh) are all nicely aligned.

  PID  PPID USER      STARTED S PRI %CPU     TIME %MEM   RSZ    VSZ CMD
[...]
 4620  4607 68         Jun 17 S  22  0.0 00:00:00  0.0   860  12348 hald-addon-keyboard: listening on /dev/input/event0
 4709     1 root       Jun 17 S  17  0.0 00:00:00  0.0   496   8540 /usr/bin/hidd --server
 4739     1 root       Jun 17 S  21  0.0 00:11:14  0.0  3576 300132 /sbin/rsyslogd -m 0
 6894     1 root       Jun 17 S  18  0.0 00:00:00  0.0  1540 122008 automount
 6918     1 root       Jun 17 S  24  0.0 00:00:08  0.0  1224  63544 /usr/sbin/sshd

The trouble starts when the process in question runs long enough (as seen
on a different machine) that its runtime no longer fits the columns reserved
for that field, which shifts the rest of the line (process runtime is just
one example; I suppose any value that outgrows its reserved space would
trigger this behavior):

  PID  PPID USER      STARTED S PRI %CPU     TIME %MEM   RSZ    VSZ CMD
[...]
 5377     1 root       May 24 S  21  0.0 00:00:00  0.0   444   3816 /sbin/mingetty tty4
 5378     1 root       May 24 S  20  0.0 00:00:00  0.0   444   3816 /sbin/mingetty tty5
 5380     1 root       May 24 S  19  0.0 00:00:00  0.0   444   3816 /sbin/mingetty tty6
 5382     1 root       May 24 S  22  0.0 00:00:00  0.0   496   3824 /sbin/agetty 9600 ttyS1 vt100
 8734     1 root       Jun 20 S  21  7.7 3-06:51:29  0.1 48640 292664 /sbin/rsyslogd -m 0
20468   262 root       Jul 19 S  24  0.0 00:04:01  0.0     0      0 [pdflush]

In this case the above regex no longer seems to match, because
(apparently) the matching starts at some fixed column value. Just for fun
and to double check, I extended the rule set with another rule:

CLASS=linux
    PROC     "%^/sbin/rsyslogd -m 0$"
    PROC     "%^[0-9]+ /sbin/rsyslogd -m 0$"

After doing so, the first rule indeed still fails while the second rule
matches. So apparently the last digit of the VSZ field of rsyslogd(8)
sneaked into the CMD field and gets matched by the PROC check. Is this a
known bug, and if so, is there a good workaround apart from invoking a
wrapper script in xymonclient-linux.sh which mangles the output of ps(1)
accordingly?
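
The effect can be reproduced outside Xymon with a few lines of Python. This is only a sketch of the suspected mechanism: it assumes the command is cut from each line at the column where the header says CMD, and the field widths in ROW are chosen to reproduce the listings above rather than taken from ps(1) or Xymon.

```python
import re

# Field widths chosen to match the listings; an assumption, not ps(1) internals.
ROW = "%5s %5s %-8s %8s %s %3s %4s %8s %4s %5s %6s %s"
HEADER = ROW % ("PID", "PPID", "USER", "STARTED", "S", "PRI", "%CPU",
                "TIME", "%MEM", "RSZ", "VSZ", "CMD")


def command_field(header, line):
    """Cut the command out of a ps line at the column where the header says CMD."""
    return line[header.index("CMD"):]


aligned = ROW % ("4739", "1", "root", "Jun 17", "S", "21", "0.0",
                 "00:11:14", "0.0", "3576", "300132", "/sbin/rsyslogd -m 0")
# A 10-character TIME value overflows its 8-character column and shifts the rest.
shifted = ROW % ("8734", "1", "root", "Jun 20", "S", "21", "7.7",
                 "3-06:51:29", "0.1", "48640", "292664", "/sbin/rsyslogd -m 0")

strict = re.compile(r"^/sbin/rsyslogd -m 0$")
relaxed = re.compile(r"^[0-9]+ /sbin/rsyslogd -m 0$")

for line in (aligned, shifted):
    cmd = command_field(HEADER, line)
    print(repr(cmd), bool(strict.search(cmd)), bool(relaxed.search(cmd)))
```

With the aligned line the cut yields the plain command. With the shifted line it yields the command preceded by the last digit of VSZ, so only the relaxed pattern matches.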

Thanks in advance!
-cs

Christoph Schug | 1 Aug 2011 20:59

Missing mapping in docs/Renaming-430.txt upgrade document

You might want to add this to the upgrade document

--- xymon-4.3.3/docs/Renaming-430.txt.orig      2011-03-08 18:20:28.000000000 +0100
+++ xymon-4.3.3/docs/Renaming-430.txt   2011-08-01 20:58:20.000000000 +0200
@@ -144,6 +144,7 @@
 BBHOSTHISTLOG                  XYMONHOSTHISTLOG
 BBHOSTS                                HOSTSCFG
 BBHTML                         XYMONHTMLSTATUSDIR
+BBLOCATION                      XYMONNETWORK
 BBLOGSTATUS                    XYMONLOGSTATUS
 BBLOGS                         XYMONRAWSTATUSDIR
 BBMAXMSGSPERCOMBO              MAXMSGSPERCOMBO

Cheers
-cs

Christoph Schug | 1 Aug 2011 21:07

Re: Xymon PROC check fails on un-aligned ps(1) output

On Mon, 01 Aug 2011 20:32:10 +0200, Christoph Schug <cs@...> wrote:
> I have got a question regarding Xymon 4.3.3 (running on CentOS
> 5.6/x86_64). In order to monitor the existence of certain processes like
> rsyslogd(8) I have the following process rule defined in analysis.cfg:
> 
> CLASS=linux
>     PROC     "%^/sbin/rsyslogd -m 0$"
[...]

I was asked off the list (thanks, but honestly I think all of us benefit
more if the discussion stays on the list):

"Why not just dispense with the '^'? That way the RE will match regardless
of where it starts. "

I'd like to have the most exact matching possible on all my processes.
rsyslogd(8) is just an example; the same applies, for instance, to shell
scripts which run for a very long time or as a daemon. So I would rather use

     PROC     "%^/foo/bar$"

instead of just

     PROC     "/foo/bar"

or a somewhat relaxed regex, because otherwise a local user might have a look
at the script using more(1), and I don't want the process monitoring to
match things like "more /foo/bar". That would report wrong numbers, or
might even report the check as GREEN while the instance which is intended
to run no longer does.
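
The difference is easy to show with Python's re module standing in for the regex engine Xymon uses. The two command lines are made-up examples.

```python
import re

commands = ["/foo/bar", "more /foo/bar"]

anchored = re.compile(r"^/foo/bar$")
unanchored = re.compile(r"/foo/bar")

anchored_hits = [c for c in commands if anchored.search(c)]
unanchored_hits = [c for c in commands if unanchored.search(c)]

print(len(anchored_hits), len(unanchored_hits))
```

The anchored pattern counts one process; the unanchored one also counts the pager, so a single "more /foo/bar" would keep the check satisfied after the real script has died.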

-cs


Henrik Størner | 1 Aug 2011 22:38

Re: AIX memory checks

> The vmstat section looks like this:
>
>      [vmstat]
>
>      System configuration: lcpu=16 mem=28672MB ent=0.50
>

What does the "[realmem]" section look like? If you look in the 
xymonclient-aix.sh script, you'll see that it uses these commands:

echo "[realmem]"
lsattr -El sys0 -a realmem
echo "[freemem]"
vmstat 1 2 | tail -1
echo "[swap]"
lsps -s

All of this is in the "client data" linked to from the detailed status 
page. It would be nice to have an example both when it is green and when 
it is red.

Regards,
Henrik

