Rune Tipsmark | 26 Feb 11:13 2015

[Check_mk (english)] Warn if free disk space or cpu idle is ABOVE certain level

Hi guys,

One challenge we encounter is users requesting servers with too many resources. We would like warnings when, for example, free disk space is ABOVE 30% or CPU utilization is BELOW 50%, so that we can plan and resize our servers accordingly and not waste precious resources, in particular disk space, on servers that have no need for them.

Ideally, a script would pull pnp4nagios size/usage data over a 30, 90 or 180 day period and create warnings accordingly, for example "WARN: over 30% free on drive-D, average usage last 90 days: 20% / peak 22%", so that we could reclaim resources as we go.

Any ideas how this could be achieved? Even a script to output this would be great, or input on how to write it.
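
The threshold logic itself is small and could be sketched as below. This is a minimal sketch, not a finished plugin: the function name `check_overprovisioned` and its parameters are made up for illustration, and it assumes the usage history has already been extracted as a list of "percent used" samples (for instance from the RRD files that pnp4nagios writes). Reading the RRD data is deliberately left out. The return codes follow the usual Nagios convention (0 = OK, 1 = WARN, 3 = UNKNOWN).

```python
from statistics import mean


def check_overprovisioned(name, used_percent_samples, max_free_percent=30.0):
    """Return (state, message) for one resource, given its usage history in percent.

    Hypothetical helper: warns when the resource was never used enough,
    i.e. when more than max_free_percent stayed free even at peak usage.
    """
    if not used_percent_samples:
        return 3, "UNKNOWN: no usage data for %s" % name

    average = mean(used_percent_samples)
    peak = max(used_percent_samples)

    # Judge by the peak, so a short burst of high usage still counts as "needed".
    free_at_peak = 100.0 - peak
    if free_at_peak > max_free_percent:
        return 1, (
            "WARN: over %.0f%% free on %s, average usage: %.0f%% / peak %.0f%%"
            % (max_free_percent, name, average, peak)
        )
    return 0, "OK: %s, average usage: %.0f%% / peak %.0f%%" % (name, average, peak)


# Example with three made-up samples for one drive.
state, message = check_overprovisioned("drive-D", [18.0, 20.0, 22.0])
print(state, message)
```

The same function works for CPU if the samples are utilization percentages; only the threshold changes.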

 

br,

Rune 

_______________________________________________
checkmk-en mailing list
checkmk-en@...
http://lists.mathias-kettner.de/mailman/listinfo/checkmk-en
Jonas Neumann | 26 Feb 10:29 2015

[Check_mk (english)] Windows: Ignore Memory Usage if one Process uses less than X Percent of Maximum Memory

Hello,
We have some Exchange servers that we monitor with CMK. The process
"store.exe" uses a lot of RAM (as it is supposed to). Is it
possible to alert about the memory usage only if a specific process
uses less than X percent of the memory?
I did not find anything in the mailing list or in the CMK web interface.

Has anyone achieved something like this already?

Jonas
Cristopher Hermansson | 25 Feb 20:51 2015

[Check_mk (english)] Help with performance issue

Hi everyone!
I'm having some performance issues on my monitoring server. This is my setup:
Hardware: Dell R410, quad-core with Intel HT enabled, 48 GB internal memory, 600 GB SAS 15k hard drives in a RAID 1 configuration
Operating system: Debian 7.6 (no "desktop" installed)
OMD: "omd-1.21.201501"
Hosts: ~200 (10 ICMP only, 5 SNMP only)
Total checks: ~8000

I have configured the hosts that run check_mk_agent to take their host status from that check instead of ICMP.

I'm not getting a 100% check… My CPU is at about 6%, so the CPU isn't the bottleneck. Any advice on how to make check_mk take advantage of the hardware?
What should I tweak?

Best regards
Cristopher H.

Scott Shannon | 25 Feb 19:33 2015

[Check_mk (english)] Unable to place host and services into Downtime from command line

I'm hoping that someone else has run into this and has found the magic bit that needs to be changed.  

I'm attempting to write a script that will allow a host to put itself and its services into downtime when the
server is being booted deliberately.  We can go through the Web interface, but that gets cumbersome when
booting a lot of servers at the same time.

I have seen the following syntax in several places on Check_MK related websites.  (Note that our virus
scanner system will probably trash out the line containing the URL, directly following the curl
command.)   I have followed this syntax, and several variations, to no avail.  I can see that a connection is
made with the server, and I can add "&output_format=JSON" and get the expected output, but the host does not get
put into downtime.

curl -n -s "http://<my_host>/<my_site>/check_mk/view.py?
   _do_confirm=yes
  &_transid=-1
  &_do_actions=yes
  &host=HOSTNAME
  &site=
  &view_name=host
  &_down_comment=COMMENT
  &_down_from_now=From+now+for
  &_down_minutes=DOWNMINUTES
  &_username=AUTOMATIONUSER
  &_secret=AUTOMATIONSECRET"

The UPPERCASE words are replaced with our site specific information.
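
One thing that is easy to rule out is URL encoding: the request above is shown wrapped over several lines, and any stray whitespace or unescaped character in the comment would change the query string. A minimal sketch that assembles the same query string with proper escaping might look like this. The parameter names are copied from the command above and are not checked against any particular Check_MK version; the function name `build_downtime_url` and the example host, site and credentials are hypothetical.

```python
from urllib.parse import urlencode


def build_downtime_url(base_url, omd_site, host, comment, minutes, user, secret):
    """Build the view.py URL from the post as a single, properly escaped line."""
    # Same parameters, in the same order, as in the curl command above.
    params = [
        ("_do_confirm", "yes"),
        ("_transid", "-1"),
        ("_do_actions", "yes"),
        ("host", host),
        ("site", ""),
        ("view_name", "host"),
        ("_down_comment", comment),
        ("_down_from_now", "From now for"),
        ("_down_minutes", str(minutes)),
        ("_username", user),
        ("_secret", secret),
    ]
    # urlencode escapes spaces as "+" and other reserved characters as %XX.
    query = urlencode(params)
    return "%s/%s/check_mk/view.py?%s" % (base_url.rstrip("/"), omd_site, query)


# Example with made-up values.
url = build_downtime_url(
    "http://monitor.example.com", "mysite", "web01",
    "planned reboot", 30, "automation", "s3cret",
)
print(url)
```

The resulting string can be passed to curl in quotes as one argument, which makes it easier to compare against what the web interface sends for the same action.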

Thank you ahead of time for showing me what I'm missing!

Scott Shannon

----------------------------------------------------------------------
The information contained in this transmission may be confidential. Any disclosure, copying, or further
distribution of confidential information is not permitted unless such privilege is explicitly granted
in writing by Quantum. Quantum reserves the right to have electronic communications, including email
and attachments, sent across its networks filtered through anti virus and spam software programs and
retain such messages in order to comply with applicable data security and retention requirements.
Quantum is not responsible for the proper and complete transmission of the substance of this
communication or for any delay in its receipt.
Egert Vero | 25 Feb 18:27 2015

[Check_mk (english)] Adding a new agent type in WATO

Hi,

Since I can't install the check_mk agent on some Windows boxes, I monitor them with a custom WMI agent, adding them manually in main.mk.

Is there any way I can add a new agent type in WATO and use the custom agent?

Thanks,
Egert

Scott Shannon | 25 Feb 17:06 2015

[Check_mk (english)] [abrt] a crash has been detected again

We are running Oracle Unbreakable Linux (a slightly modified RHEL).  Recently we ran an update, and on 4
servers in our environment it updated the kernel to 2.6.39-400.247.1.el6uek.x86_64 - a higher version
than any of our other servers.

This is primarily an annoyance as our systems still run fine, but 2 or 3 times a day (at random times) we
receive emails similar to the one below.  I removed and reinstalled the client (1.2.4p5-1) but it made no
difference, as I expected.  I realize that this may be a red herring for something else, but the consistent
item between all the messages is the mention of the check_mk_agent.

I'm hoping that someone else has seen this and has found a solution.

Scott Shannon 

-----Original Message-----
From: ABRT Daemon
Sent: Wednesday, February 25, 2015 8:23 AM
To: root@...
Subject: [abrt] a crash has been detected again

abrt_version:   2.0.8
cgroup:         
cmdline:        /bin/bash /usr/bin/check_mk_agent
executable:     /bin/bash
hostname:       <server_name>
kernel:         2.6.39-400.247.1.el6uek.x86_64
last_occurrence: 1424877779
pid:            37347
pwd:            /
time:           Wed 18 Feb 2015 10:39:17 PM MST
uid:            0
username:       root

sosreport.tar.xz: Binary file, 1522040 bytes

dso_list:
:/lib64/ld-2.12.so glibc-2.12-1.149.el6_6.5.x86_64 (Oracle America) 1423849280
:/usr/lib/locale/locale-archive glibc-common-2.12-1.149.el6_6.5.x86_64 (Oracle America) 1423849277
:/bin/bash bash-4.1.2-29.el6.0.1.x86_64 (Oracle America) 1423849273
:/usr/lib64/gconv/gconv-modules.cache glibc-2.12-1.149.el6_6.5.x86_64 (Oracle America) 1423849280
:/lib64/libtinfo.so.5.7 ncurses-libs-5.7-3.20090208.el6.x86_64 (Oracle America) 1380789194
:/lib64/libnss_files-2.12.so glibc-2.12-1.149.el6_6.5.x86_64 (Oracle America) 1423849280
:/usr/lib64/gconv/ISO8859-1.so glibc-2.12-1.149.el6_6.5.x86_64 (Oracle America) 1423849280
:/lib64/libc-2.12.so glibc-2.12-1.149.el6_6.5.x86_64 (Oracle America) 1423849280
:/lib64/libdl-2.12.so glibc-2.12-1.149.el6_6.5.x86_64 (Oracle America) 1423849280

environ:
:LC_MONETARY=en_US
:TERM=linux
:LC_NUMERIC=en_US
:LC_ALL=en_US
:PATH=/sbin:/usr/sbin:/bin:/usr/bin
:LC_MESSAGES=en_US
:runlevel=5
:RUNLEVEL=5
:LC_COLLATE=en_US
:LANGSH_SOURCED=1
:PWD=/
:LANG=en_US
:previous=N
:PREVLEVEL=N
:CONSOLETYPE=vt
:SHLVL=3
:UPSTART_INSTANCE=
:UPSTART_EVENTS=runlevel
:UPSTART_JOB=rc
:LC_TIME=en_US
:_=/usr/sbin/xinetd
:REMOTE_HOST=::ffff:10.50.33.133

limits:
:Limit                     Soft Limit           Hard Limit           Units     
:Max cpu time              unlimited            unlimited            seconds   
:Max file size             unlimited            unlimited            bytes     
:Max data size             unlimited            unlimited            bytes     
:Max stack size            8388608              unlimited            bytes     
:Max core file size        0                    unlimited            bytes     
:Max resident set          unlimited            unlimited            bytes     
:Max processes             515008               515008               processes 
:Max open files            4096                 4096                 files     
:Max locked memory         65536                65536                bytes     
:Max address space         unlimited            unlimited            bytes     
:Max file locks            unlimited            unlimited            locks     
:Max pending signals       515008               515008               signals   
:Max msgqueue size         819200               819200               bytes     
:Max nice priority         0                    0                    
:Max realtime priority     0                    0                    
:Max realtime timeout      unlimited            unlimited            us        

maps:
:00400000-004d4000 r-xp 00000000 08:03 1441845                            /bin/bash
:006d4000-006dd000 rw-p 000d4000 08:03 1441845                            /bin/bash
:006dd000-006e3000 rw-p 00000000 00:00 0 
:008dc000-008e5000 rw-p 000dc000 08:03 1441845                            /bin/bash
:01ef0000-01f11000 rw-p 00000000 00:00 0                                  [heap]
:3292000000-3292020000 r-xp 00000000 08:03 1310765                        /lib64/ld-2.12.so
:329221f000-3292220000 r--p 0001f000 08:03 1310765                        /lib64/ld-2.12.so
:3292220000-3292221000 rw-p 00020000 08:03 1310765                        /lib64/ld-2.12.so
:3292221000-3292222000 rw-p 00000000 00:00 0 
:3292400000-329258a000 r-xp 00000000 08:03 1310876                        /lib64/libc-2.12.so
:329258a000-329278a000 ---p 0018a000 08:03 1310876                        /lib64/libc-2.12.so
:329278a000-329278e000 r--p 0018a000 08:03 1310876                        /lib64/libc-2.12.so
:329278e000-329278f000 rw-p 0018e000 08:03 1310876                        /lib64/libc-2.12.so
:329278f000-3292794000 rw-p 00000000 00:00 0 
:3292c00000-3292c02000 r-xp 00000000 08:03 1310894                        /lib64/libdl-2.12.so
:3292c02000-3292e02000 ---p 00002000 08:03 1310894                        /lib64/libdl-2.12.so
:3292e02000-3292e03000 r--p 00002000 08:03 1310894                        /lib64/libdl-2.12.so
:3292e03000-3292e04000 rw-p 00003000 08:03 1310894                        /lib64/libdl-2.12.so
:32a3800000-32a381d000 r-xp 00000000 08:03 1310774                        /lib64/libtinfo.so.5.7
:32a381d000-32a3a1d000 ---p 0001d000 08:03 1310774                        /lib64/libtinfo.so.5.7
:32a3a1d000-32a3a21000 rw-p 0001d000 08:03 1310774                        /lib64/libtinfo.so.5.7
:7f1a29d11000-7f1a29d12000 r-xp 00000000 08:03 526495                     /usr/lib64/gconv/ISO8859-1.so
:7f1a29d12000-7f1a29f12000 ---p 00001000 08:03 526495                     /usr/lib64/gconv/ISO8859-1.so
:7f1a29f12000-7f1a29f13000 r--p 00001000 08:03 526495                     /usr/lib64/gconv/ISO8859-1.so
:7f1a29f13000-7f1a29f14000 rw-p 00002000 08:03 526495                     /usr/lib64/gconv/ISO8859-1.so
:7f1a29f14000-7f1a29f20000 r-xp 00000000 08:03 1317975                    /lib64/libnss_files-2.12.so
:7f1a29f20000-7f1a2a120000 ---p 0000c000 08:03 1317975                    /lib64/libnss_files-2.12.so
:7f1a2a120000-7f1a2a121000 r--p 0000c000 08:03 1317975                    /lib64/libnss_files-2.12.so
:7f1a2a121000-7f1a2a122000 rw-p 0000d000 08:03 1317975                    /lib64/libnss_files-2.12.so
:7f1a2a122000-7f1a2ffb3000 r--p 00000000 08:03 526620                     /usr/lib/locale/locale-archive
:7f1a2ffb3000-7f1a2ffb6000 rw-p 00000000 00:00 0
:7f1a2ffc8000-7f1a2ffca000 rw-p 00000000 00:00 0 
:7f1a2ffca000-7f1a2ffd1000 r--s 00000000 08:03 526626                     /usr/lib64/gconv/gconv-modules.cache
:7f1a2ffd1000-7f1a2ffd2000 rw-p 00000000 00:00 0 
:7fffd8c7b000-7fffd8c9c000 rw-p 00000000 00:00 0                          [stack]
:7fffd8dff000-7fffd8e00000 r-xp 00000000 00:00 0                          [vdso]
:ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]

open_fds:
:0:/dev/null
:pos:	0
:flags:	0100000
:1:socket:[4577099]
:pos:	0
:flags:	02
:2:/dev/null
:pos:	0
:flags:	0100001

var_log_messages:
:Feb 16 08:34:15 <server_name> kernel: check_mk_agent[5820] general protection ip:44f019 sp:7fffb10ec700 error:0 in bash[400000+d4000]
:Feb 16 08:34:15 pplerpap2 abrt[5886]: Saved core dump of pid 5820 (/bin/bash) to /var/spool/abrt/ccpp-2015-02-16-08:34:15-5820 (495616 bytes)
:Feb 17 02:57:15 <server_name> abrt[8618]: Saved core dump of pid 8578 (/bin/bash) to /var/spool/abrt/ccpp-2015-02-17-02:57:15-8578 (507904 bytes)

Alan Murrell | 25 Feb 10:18 2015

[Check_mk (english)] check_mk multisite (OMD): Overview of local sites?

Hello,

I am wondering whether, with an OMD install where one creates multiple
sites, there is a way to have an overview of all those sites.  Could this
be done via the "distributed monitoring" config, except specifying the
local server/sites?

My use case is this: I have a client who has one big site and a few
smaller "satellite" offices, connected to the main site by VPN.  None
of the "satellite" offices is big enough to justify its own OMD
installation, so I plan on monitoring them via the OMD install at the
main site, but I would still like to keep each location separate.

So I am thinking something like this:

   - Create a site for each location ("site1", "site2", "site3", etc.)
   - Set up monitoring and access for each site's devices under the
     appropriate site
   - Create another site with a name for the organisation itself
   - Under the "Organisation Site", consolidate the various locations
     into an overall view/management (possibly using the distributed
     monitoring config)

Is the above feasible?  If not, it is not the end of the world; I can 
just create the devices for each location under a single site and put 
the site name as part of the device name (e.g., "[site1] Switch 1"), but 
if it can be done as individual sites, that would be preferred :-)

Thanks! :-)

Regards,

Alan
Ralf Prengel | 25 Feb 10:09 2015

[Check_mk (english)] reports for customers with selected services


Hello,

I'm using Thruk to generate reports.
My problem:
I'm looking for a way to generate reports for customers that contain only
some selected services and host information.
The customers don't need, for example, every single database check.

Any ideas or hints?

Ralf
Jolyon Brown | 24 Feb 17:07 2015

[Check_mk (english)] Postgres Monitoring

Hi All

Another question from me, I'm afraid. Amongst the community here, what is the most popular way of monitoring PostgreSQL databases? I'm looking at the mk_postgres plugin but wondering what other plugins people are using, if any.

Cheers

Jolyon

Limilo Ltd

Registered in England no: 07778174

Registered office: Netpark Incubator, Thomas Wright Way, Sedgefield, TS21 3FD

VAT no: 190425515

Kim Mount | 24 Feb 09:42 2015

[Check_mk (english)] HP ILO Monitoring

Hi there,

Hope everyone is well.

I'm just wondering what the current best way of monitoring HP iLO 3/4 is.

It can send SNMP alerts, but from what I can see it's not walkable, which is a shame.

I am wondering whether accessing it from within the OS is the best option, or whether writing something that talks to it over the SSH console makes the most sense.

What's everyone else doing these days? I'm looking to target things like unplugged PSUs, failed disks, etc.

Many thanks for your thoughts.
Kim
Stephen Berg (Contractor | 23 Feb 19:13 2015

[Check_mk (english)] Monitoring dovecot processes

I'm trying to set up a check to monitor the number of dovecot processes
on an IMAP server.  The ps output from the command line shows
"dovecot/imap", "dovecot/imap-login" and occasionally "dovecot/auth -w"
processes.  I'm having difficulty finding the right regex so that all the
dovecot processes are included in this one check_ps service.  Has anyone
set that up and wants to give me a hint?

I've tried variations of the following, and it only seems to find the
specifically named processes; I'd like it to find anything with "dovecot"
in the process list.

( ["dovecot"], ALL_HOSTS, "ps", "dovecot", ("dovecot/imap", 1, 1, 128, 128) ),
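
The difference between an exact process name and a regular expression can be tried out in plain Python. As far as I know, the ps check compares a plain string against the whole process name and treats a name starting with "~" as a regular expression that has to match from the beginning of the command line. Treat that as an assumption and verify it against the ps check's documentation for your version. Under that assumption a pattern such as "~.*dovecot" in place of "dovecot/imap" would be the thing to try. The sketch below only imitates the two matching modes; the helper names and the extra "sshd" line are made up.

```python
import re

# Command lines as described in the post, plus one unrelated process.
command_lines = [
    "dovecot/imap",
    "dovecot/imap-login",
    "dovecot/auth -w",
    "sshd: root@pts/0",
]


def count_exact(name, lines):
    """Count lines that are exactly equal to the given process name."""
    return sum(1 for line in lines if line == name)


def count_regex(pattern, lines):
    """Count lines where the regex matches, anchored at the start of the line."""
    regex = re.compile(pattern)
    return sum(1 for line in lines if regex.match(line))


exact_hits = count_exact("dovecot/imap", command_lines)
regex_hits = count_regex(r".*dovecot", command_lines)
print("exact:", exact_hits, "regex:", regex_hits)
```

With these sample lines the exact name finds only the single "dovecot/imap" process, while the regular expression counts all three dovecot processes and skips the unrelated one.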

-- 
Stephen Berg
Systems Administrator
NRL Code: 7320
Office: 228-688-5738
stephen.berg.ctr@...
