Matt Lawson | 5 Apr 15:44 2005

FreeBSD stops issuing probes...

Hi,
This may be the same issue as "Daemons Stops - No graphs created - 
after few hours ", or maybe not, but I want to clarify what I see...

Smokeping works fine for several hours, then we start getting alert 
emails saying 100% loss of information.  The smokeping process itself 
is still running (if you try to start it, it will say 'another 
smokeping process already running') and it continues to send out the 
alert emails.  However, no probes seem to be happening.

Now the probe that I'm using is one I wrote called CheckTcp.  It's a 
'one at a time' type probe and therefore uses the basefork class.  
Currently I have forks=2, although I have also tried forks=5 with the 
same result.  It runs with a timeout of 2 s and 4 pings per target 
(2 s x 4 = the 8 s timeout in the log below) against several hundred 
targets.
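
As a rough sanity check on those numbers, here is a back-of-envelope sketch in Python (not part of Smokeping). It assumes the worst case where every ping of every target times out, and it uses 300 targets as a made-up stand-in for "hundreds"; the function name is also hypothetical.

```python
import math

def worst_case_round_seconds(targets, forks, timeout_s, pings):
    """Upper bound on one round if every ping of every target times out."""
    per_target = timeout_s * pings        # 2 s x 4 pings = 8 s per target
    batches = math.ceil(targets / forks)  # targets are handled 'forks' at a time
    return batches * per_target

# 300 targets is an assumed figure, not taken from the message.
for forks in (2, 5):
    print(forks, worst_case_round_seconds(300, forks, 2, 4))
```

With these assumed figures the bound is 1200 s for forks=2 and 480 s for forks=5. Real rounds would normally be much shorter, since most targets answer quickly, but the bound is worth comparing against the configured step.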

The last probe activity I see in the log is this (machine names and 
IPs masked out):

Apr  5 12:27:57 smokeping[25726]: CheckTcp: [addr1]: timeout (8 s) reached, killing the probe.
Apr  5 12:28:07 smokeping[25726]: CheckTcp: [addr2]: timeout (8 s) reached, killing the probe.
Apr  5 12:28:21 smokeping[25726]: CheckTcp: [addr3]: caught exception
Apr  5 12:28:30 smokeping[25726]: CheckTcp: [addr4]: timeout (8 s) reached, killing the probe.

All of this by itself is not necessarily unusual (the caught exception 
doesn't seem to be the killer; there are probes that do that then run 
again later).  The problem is that the probe activity simply stops.  I 

Matt Lawson | 5 Apr 20:17 2005

How to deal with non-result in pingone?

When I have a custom probe that uses the basefork class, and I create 
a pingone method, what should the pingone method do if it gets to the 
end and hasn't been able to produce a result?

In most of the examples, the results (if any) are pushed into an array 
called @times and then at the end of the function there is a:

return sort { $a <=> $b } @times;

Or something similar.

However I seem to be having a problem in the case where the times array 
is empty.  Smokeping will run through 2 'rounds' of samples (e.g. 2 
seconds elapsed each time), but the third time through will hang for a 
very long time (e.g. 100 seconds) and then give the message about 
"WARNING: A single round took 100 seconds when it should have been 
10..."

If I ensure that the @times array always has at least 1 value (even if 
it's zero) it seems to keep running the rounds in the proper amount of 
time, i.e. works as expected.
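
The two behaviours being compared here, returning an empty list versus padding it with a placeholder value, can be sketched outside Smokeping. This is Python rather than the Perl of a real probe, and `collect_times` and `measure_once` are hypothetical names; the sketch only shows the shape of the returned list, not how Smokeping itself reacts to it.

```python
def collect_times(measure_once, pings, pad_empty=False):
    """Mimic the shape of a pingone result: a sorted list of times.

    measure_once is a hypothetical callable that returns a time in
    seconds, or None when the attempt produced no result.
    """
    times = []
    for _ in range(pings):
        rtt = measure_once()
        if rtt is not None:
            times.append(rtt)
    if not times and pad_empty:
        # Workaround from the message: guarantee at least one value.
        # In this list a 0 is indistinguishable from a real, very fast result.
        times.append(0.0)
    return sorted(times)  # same idea as: return sort { $a <=> $b } @times;

# A probe that never gets an answer, with and without padding.
print(collect_times(lambda: None, 4))
print(collect_times(lambda: None, 4, pad_empty=True))
```

The padding keeps the list non-empty, at the cost of reporting a measurement that never happened.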

So.... what should I do with my empty @times array?

System is FreeBSD 5.2.1 and smokeping is version 1.38.

Thanks,
Matt


Mesbah | 6 Apr 05:38 2005

Re: changing interval

Hi Marc,
You can change the number of pings and the time interval in the smokeping.conf file:

*** Database ***
step     = 300
pings    = 20

----- Original Message ----- 
From: "MJonkers" <wemeelen@xs4all.nl>
To: <smokeping-users@list.ee.ethz.ch>
Sent: Wednesday, March 30, 2005 01:18
Subject: [smokeping-users] changing interval

> Hi,
> Is it possible to change the amount of pings and the time (300 seconds).
>
>
> Thanks,
>
> Marc

madzi | 6 Apr 10:20 2005

problem with smokeping.pid


I'm a newbie trying to install smokeping. I got it running yesterday,
but now I get this error. Can anyone help, please?

matrix-sparc# /usr/local/etc/rc.d/smokeping.sh start
Use of uninitialized value in kill at /usr/local/smokeping/lib/Smokeping.pm line 1984.
Use of uninitialized value in concatenation (.) or string at /usr/local/smokeping/lib/Smokeping.pm line 1984.
ERROR: I Quit! Another copy of /usr/local/bin/smokeping () seems to be running.
       Check /usr/local/var/smokeping/smokeping.pid
matrix-sparc# /usr/local/etc/rc.d/smokeping.sh stop
kill: Too few arguments.


Matt Lawson | 6 Apr 17:55 2005

Re: How to deal with non-result in pingone?

Aah, it seems this all may be a red herring.

Turns out that one reason everything was so slow is our email program 
was taking 45s to process each mail request, causing the long delays.

Another factor is we upgraded from smokeping 1.31 to 1.38.

After that, returning empty @times arrays from the probe didn't seem to 
be a problem anymore.  It's still early but it looks like this is where 
it's headed.

Cougar | 7 Apr 21:48 2005

small patch to let configure fping source IP


Hi,

I patched fping so that any configured IP can be used as the source 
address via -I, as ping allows. To use it with smokeping I also made some 
small changes to smokeping (the FPing module).

Both the fping and smokeping patches are available here:

http://wiki.version6.net/Patches

With this patch you can use the variable "source" in the FPing probe config.

One idea that would be even more useful is to allow this variable 
inside a target. That way a probe host connected to several networks 
could test each of them separately. But it is not an easy task, as 
targets that use different sources have to be pinged separately, which 
means a bigger rewrite. Something like this:

++ ISP1
menu = ISP1
title = ISP1
source = 1.1.1.1

+++ Google
menu = Google
title = Google
host = www.google.com

++ ISP2
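
The grouping that such a rewrite would need can be sketched roughly as follows. This is an illustrative Python sketch, not the patch itself: the function names and the target dictionaries are hypothetical, the second target is invented for the example, and `-I` is used as the source-address option only because the message describes the patched fping that way.

```python
from collections import defaultdict

def group_by_source(targets, default_source=None):
    """Map each source address to the hosts to be pinged from it."""
    groups = defaultdict(list)
    for t in targets:
        groups[t.get("source", default_source)].append(t["host"])
    return dict(groups)

def fping_commands(groups, pings=20):
    """Build one fping argument list per source address (nothing is run)."""
    cmds = []
    for source, hosts in groups.items():
        cmd = ["fping", "-C", str(pings), "-q"]
        if source is not None:
            cmd += ["-I", source]  # source option as described for the patch
        cmds.append(cmd + hosts)
    return cmds

targets = [
    {"host": "www.google.com", "source": "1.1.1.1"},  # from the example config
    {"host": "www.example.org", "source": "2.2.2.2"}, # invented for illustration
]
for cmd in fping_commands(group_by_source(targets)):
    print(" ".join(cmd))
```

Each distinct source ends up with its own fping invocation, which is why per-target sources cost more than a single probe-wide setting.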

Tony.Cetera | 12 Apr 15:27 2005

CiscoRTTMonEchoICMP caught exception

Hello,

I'm using smokeping 1.38 on Solaris 8 with Perl 5.8.0.  I have configured the CiscoRTTMonEchoICMP probe
with 50 forks and 16 targets.  All 16 tests run but every 5 minutes smokeping throws a few of these into the log:

smokeping[pid#]: CiscoRTTMonEchoICMP: {ip address}: caught exception

The failed targets are different every time. It appears that the targets who "catch the exception" report
packet loss for that interval.  Is there any way to determine what is causing the error?  Is there something
in my configuration that may be causing this?

Tony Cetera

John Hally | 12 Apr 21:23 2005

graph question


Hello all,

This is going to be a really dumb question, but can someone explain to me
the relevance of the light/dark gray entries on the smokeping graphs?  Is
this the variance over the 20 requests?

Thanks in advance!

Arnold Nipper | 13 Apr 01:07 2005

Re: graph question

On 12.04.2005 21:23 John Hally wrote

> Hello all,
> 
> This is going to be a really dumb question, but can someone explain to me
> the relevance of the light/dark gray entries on the smokeping graphs?  Is
> this the variance over the 20 requests?
> 

Actually it's not really the variance but all of the n values scattered 
around the median.
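
To illustrate what "values scattered around the median" can look like, here is a small Python sketch with made-up ping times. It is not Smokeping's drawing code, and `smoke_summary` is a hypothetical helper; it only computes the minimum, median and maximum of one round.

```python
import statistics

def smoke_summary(rtts_ms):
    """Return (min, median, max) of one round of ping times."""
    ordered = sorted(rtts_ms)
    return ordered[0], statistics.median(ordered), ordered[-1]

# Made-up round of 10 samples: most near 20 ms, two delayed ones.
sample = [20.1, 20.3, 20.2, 20.4, 20.2, 20.3, 24.0, 20.1, 31.5, 20.2]
lo, med, hi = smoke_summary(sample)
print("below median:", med - lo)
print("above median:", hi - med)
```

In this sample the fastest value sits a fraction of a millisecond below the median while the slowest is more than 11 ms above it, so a spread around the median does not have to be symmetric.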

Arnold

Jason D. Hammerschmidt | 13 Apr 05:35 2005

Re: graph question


>> This is going to be a really dumb question, but can someone explain to me
>> the relevance of the light/dark gray entries on the smokeping graphs?  Is
>> this the variance over the 20 requests?
> Actually it's not really the variance but all of the n values scattered
> around the median.

Excellent question actually, but I am confused about the answer.  If 
it's variance around the median, why do so many of my graphs show grey 
lines skewed to only one side, usually upwards?  If it were around the 
median, shouldn't it be grey on both sides of the main coloured bar?  Or 
does that represent something different?

