Re: [opennms-discuss] 1.3.2 -- Events stop after 1 hour
Paul Mona <pmona <at> commpartners.us>
2007-02-01 13:39:38 GMT
I have loaded our production DB on our 1.3.2 box and am again running
out of memory and seeing the threads crash. How much memory will this
take? I would think a dual 3G CPU box with 4G of RAM would be plenty
for this app. Except for Tomcat and postgresql there is nothing else on
this box.
In the output.log there are lots of lines like these:
20333.918: [Full GC [PSYoungGen: 116544K->116544K(233024K)] [PSOldGen:
699072K->699072K(699072K)] 815616K->815616K(932096K) [PSPermGen:
39634K->39634K(39936K)] GC time is exceeding GCTimeLimit of 98%
And output from start and crash:
------------------------------------------------------------------------
------
Fri Jan 26 09:23:43 PST 2007
begin ulimit settings:
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
file size (blocks, -f) unlimited
pending signals (-i) 1024
max locked memory (kbytes, -l) 32
max memory size (kbytes, -m) unlimited
open files (-n) 10240
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 36864
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
end ulimit settings
Executing command: /opt/opennms/bin/runjava -r --
-Dopennms.home=/opt/opennms -Xmx1024m -XX:+PrintGCTimeStamps
-XX:+PrintGCDetails -jar /opt/opennms/li
b/opennms_bootstrap.jar start
938.225: [GC [PSYoungGen: 110976K->4049K(122496K)]
199261K->92335K(226816K), 0.0218470 secs]
952.125: [GC [PSYoungGen: 115345K->4715K(116032K)]
203631K->93000K(220352K), 0.0239260 secs]
983.120: [GC [PSYoungGen: 87948K->4325K(121408K)]
176234K->92611K(225728K), 0.0265640 secs]
983.147: [Full GC[Unloading class
sun.reflect.GeneratedConstructorAccessor15]
[Unloading class sun.reflect.GeneratedConstructorAccessor26]
[Unloading class sun.reflect.GeneratedConstructorAccessor25]
[Unloading class sun.reflect.GeneratedMethodAccessor2]
[Unloading class sun.reflect.GeneratedConstructorAccessor17]
[Unloading class sun.reflect.GeneratedMethodAccessor4]
[Unloading class sun.reflect.GeneratedMethodAccessor5]
[Unloading class sun.reflect.GeneratedConstructorAccessor14]
[Unloading class sun.reflect.GeneratedConstructorAccessor20]
[Unloading class sun.reflect.GeneratedMethodAccessor9]
[Unloading class sun.reflect.GeneratedConstructorAccessor19]
[Unloading class sun.reflect.GeneratedConstructorAccessor22]
[Unloading class sun.reflect.GeneratedConstructorAccessor23]
[Unloading class sun.reflect.GeneratedConstructorAccessor16]
[Unloading class sun.reflect.GeneratedMethodAccessor6]
[Unloading class sun.reflect.GeneratedConstructorAccessor24]
[Unloading class sun.reflect.GeneratedConstructorAccessor21]
[Unloading class sun.reflect.GeneratedConstructorAccessor18]
[Unloading class sun.reflect.GeneratedMethodAccessor3]
[PSYoungGen: 4325K->0K(121408K)] [PSOldGen: 88285K->85744K(104320K)]
92611K->85744K(225728K) [PSPermGen: 37294K->37294K(40000K)], 0.4306280
secs]
1019.526: [GC [PSYoungGen: 111680K->5696K(121920K)]
197424K->91440K(226240K), 0.0227150 secs]
1043.588: [GC [PSYoungGen: 33395K->2512K(121792K)]
119140K->88256K(226112K), 0.0123810 secs]
1043.600: [Full GC[Unloading class
sun.reflect.GeneratedConstructorAccessor41]
[Unloading class sun.reflect.GeneratedConstructorAccessor43]
[Unloading class sun.reflect.GeneratedConstructorAccessor40]
12252.082: [Full GC [PSYoungGen: 116543K->116383K(233024K)] [PSOldGen:
699071K->699071K(699072K)] 815615K->815455K(932096K) [PSPermGen:
40050K->40050K(400
64K)] GC time is exceeding GCTimeLimit of 98%
, 2.3448440 secs]
Exception in thread "default" java.lang.OutOfMemoryError: Java heap
space
12254.433: [Full GC [PSYoungGen: 116544K->116544K(233024K)] [PSOldGen:
699071K->699071K(699072K)] 815615K->815615K(932096K) [PSPermGen:
40051K->40051K(400
64K)] GC time is exceeding GCTimeLimit of 98%
, 2.1694680 secs]
12256.628: [Full GC [PSYoungGen: 116544K->116544K(233024K)] [PSOldGen:
699071K->699071K(699072K)] 815615K->815615K(932096K) [PSPermGen:
40051K->40051K(400
64K)] GC time is exceeding GCTimeLimit of 98%
, 2.1674210 secs]
12258.805: [Full GC [PSYoungGen: 116544K->116544K(233024K)] [PSOldGen:
699071K->699071K(699072K)] 815615K->815615K(932096K) [PSPermGen:
40051K->40051K(400
64K)] GC time is exceeding GCTimeLimit of 98%
, 2.1695010 secs]
12260.979: [Full GC [PSYoungGen: 116544K->116380K(233024K)] [PSOldGen:
699071K->699071K(699072K)] 815615K->815451K(932096K) [PSPermGen:
40051K->40051K(400
64K)] GC time is exceeding GCTimeLimit of 98%
, 2.9448990 secs]
Exception in thread "Thread-116" java.lang.OutOfMemoryError: Java heap
space
Exception in thread "CollectdScheduler-50 Pool"
java.lang.OutOfMemoryError: Java heap space
Exception in thread "SnmpPortal--1-FastReceiver"
java.lang.OutOfMemoryError: Java heap space
Exception in thread "TrapQueueProcessor" java.lang.OutOfMemoryError:
Java heap space
12263.966: [Full GC [PSYoungGen: 116544K->116544K(233024K)] [PSOldGen:
699071K->699071K(699072K)] 815615K->815615K(932096K) [PSPermGen:
40056K->40056K(400
64K)] GC time is exceeding GCTimeLimit of 98%
, 2.2081590 secs]
12266.186: [Full GC [PSYoungGen: 116544K->116544K(233024K)] [PSOldGen:
699071K->699071K(699072K)] 815615K->815615K(932096K) [PSPermGen:
40056K->40056K(400
64K)] GC time is exceeding GCTimeLimit of 98%
, 2.1736410 secs]
12268.369: [Full GC [PSYoungGen: 116544K->116544K(233024K)] [PSOldGen:
699071K->699071K(699072K)] 815615K->815615K(932096K) [PSPermGen:
40056K->40056K(400
64K)] GC time is exceeding GCTimeLimit of 98%
, 2.1619620 secs]
12270.534: [Full GC [PSYoungGen: 116544K->116398K(233024K)] [PSOldGen:
699071K->699071K(699072K)] 815615K->815469K(932096K) [PSPermGen:
40056K->40056K(400
64K)] GC time is exceeding GCTimeLimit of 98%
, 2.9710520 secs]
Exception in thread "Event TCP Server[5817]" java.lang.OutOfMemoryError:
Java heap space
12273.531: [Full GC [PSYoungGen: 116544K->116544K(233024K)] [PSOldGen:
699071K->699071K(699072K)] 815615K->815615K(932096K) [PSPermGen:
40056K->40056K(400
64K)] GC time is exceeding GCTimeLimit of 98%
, 2.1589000 secs]
12275.700: [Full GC [PSYoungGen: 116544K->116544K(233024K)] [PSOldGen:
699071K->699071K(699072K)] 815615K->815615K(932096K) [PSPermGen:
40056K->40056K(400
64K)] GC time is exceeding GCTimeLimit of 98%
, 2.1691140 secs]
12277.873: [Full GC [PSYoungGen: 116544K->116544K(233024K)] [PSOldGen:
699071K->699071K(699072K)] 815615K->815615K(932096K) [PSPermGen:
40056K->40056K(400
64K)] GC time is exceeding GCTimeLimit of 98%
, 2.1683000 secs]
12280.055: [Full GC [PSYoungGen: 116544K->115736K(233024K)] [PSOldGen:
699071K->699071K(699072K)] 815615K->814807K(932096K) [PSPermGen:
40056K->40056K(400
64K)] GC time is exceeding GCTimeLimit of 98%
, 2.9479640 secs]
Exception in thread "UDP Event Receiver[5817]"
java.lang.OutOfMemoryError: Java heap space
12283.022: [Full GC [PSYoungGen: 116544K->0K(233024K)] [PSOldGen:
699071K->177995K(540288K)] 815615K->177995K(773312K) [PSPermGen:
40057K->40057K(40064K)]
, 1.2057900 secs]
Exception in thread "SnmpPortal--1" java.lang.OutOfMemoryError: Java
heap space
12344.228: [GC [PSYoungGen: 68829K->2138K(233024K)]
246824K->180133K(773312K), 0.0178460 secs]
12344.246: [Full GC [PSYoungGen: 2138K->0K(233024K)] [PSOldGen:
177995K->180075K(540288K)] 180133K->180075K(773312K) [PSPermGen:
40101K->40101K(40448K)],
1.1005890 secs]
12357.467: [GC [PSYoungGen: 116544K->103456K(233024K)]
296619K->283531K(773312K), 0.7651500 secs]
12367.744: [GC [PSYoungGen: 220000K->116448K(233024K)]
400075K->393411K(773312K), 1.2769990 secs]
12379.783: [GC [PSYoungGen: 232992K->116448K(233024K)]
509955K->503707K(773312K), 1.2442670 secs]
12391.858: [GC [PSYoungGen: 232992K->116448K(233024K)]
620251K->622211K(773312K), 0.9171710 secs]
12392.775: [Full GC [PSYoungGen: 116448K->80124K(233024K)] [PSOldGen:
505763K->540287K(699072K)] 622211K->620412K(932096K) [PSPermGen:
40108K->40108K(4044
8K)], 2.0850330 secs]
12398.214: [GC [PSYoungGen: 116544K->115776K(233024K)]
656831K->656063K(932096K), 0.5813200 secs]
12409.178: [GC [PSYoungGen: 232320K->116448K(233024K)]
772607K->768287K(932096K), 0.9148160 secs]
12410.093: [Full GC [PSYoungGen: 116448K->57789K(233024K)] [PSOldGen:
651839K->699071K(699072K)] 768287K->756861K(932096K) [PSPermGen:
40110K->40100K(4038
4K)], 2.9382200 secs]
|-----Original Message-----
|From: Paul Mona
|Sent: Saturday, January 27, 2007 4:35 AM
|To: 'opennms-discuss <at> lists.sourceforge.net'
|Subject: Re: 1.3.2 -- Events stop after 1 hour
|
||On Jan 26, 2007, at 6:48 PM, Paul Mona wrote:
||
||>> How many RRD files do you have?
||>
||> There are 100s of directories both in /response and /snmp
||
||Send us some output from:
|| tail -f uncategorized.log |grep '^QS'
||
|
|[root <at> las-eng-test ~]# tail -f
|/opt/opennms/logs/uncategorized.log |grep '^QS'
|QS: totalOperationsPending=107, significantOpsPending=5,
|filesWithSignificantWork=5, filesWithInsignificantWork=102
|QS: , createsCompleted=0, udpatesCompleted=980000,
|errors=290, promotionRate=0.0, promotionCount=0
|QS: , currentEnqueueRates=(22/7/30),
|currentDequeueRate=(22/7/29), currentItemDequeRate=29,
|currentOpsPerUpdate=1.0,
|currentPrcntSignificant=73.90659014446864%, elapsedTime=337
|QS: , overallEnqueueRate=(16/5/22),
|overallDequeueRate=(16/5/22), overallItemDequeRate=22,
|overallOpsPerUpdate=1.0001611998163547,
|overallPrcntSignificant=75.15042099519081%,
|totalElapsedTime=44362 [127.976 sec]
|QS: totalOperationsPending=117, significantOpsPending=58,
|filesWithSignificantWork=58, filesWithInsignificantWork=59
|QS: , createsCompleted=0, udpatesCompleted=990000,
|errors=291, promotionRate=0.0, promotionCount=0
|QS: , currentEnqueueRates=(21/7/29),
|currentDequeueRate=(21/8/29), currentItemDequeRate=29,
|currentOpsPerUpdate=1.0002000200020003,
|currentPrcntSignificant=72.69004095494955%, elapsedTime=343
|QS: , overallEnqueueRate=(16/5/22),
|overallDequeueRate=(16/5/22), overallItemDequeRate=22,
|overallOpsPerUpdate=1.0001615918412279,
|overallPrcntSignificant=75.1255771500575%,
|totalElapsedTime=44705 [170.299 sec]
|
|
|
||How may CPUs?
||Are you running in 64-bit Java and are you running OpenNMS in 64-bit
||mode?
|
|The box has dual 3Gig CPUs. Yes - 64-bit
|
||
||> (BTW -- The file bin/opennms still has all the paths set to /opt/
||> OpenNMS and not /opt/opennms -- My SVN is about 1 week old),
||
||You mentioned 1 week old SVN so, that isn't the 1.3.2 release as the
||subject would suggest, and you are probably building with the -
||Dopennms.home=/opt/OpenNMS property.
||
|
|Originally I was running the 64-bit rpm, but built it last
|week from SVN thinking that something may have been addressed
|in a later build.
|That said, the problem I have been experiencing happens in both.
|
|
||
||> my system seems to be running well. After about 6 hours of run
||> time, this is the last little bit from the output.log -- I have
||> no idea what it is telling me though.
||>
||> 22606.489: [GC [PSYoungGen: 342157K->1761K(340672K)] 462910K-
||> >122514K(581824K), 0.0079570 secs]
||
||Your HEAP size, at this point, is 922,496K. Your QS stats will be
||interesting. Another interesting tool to analyze heap is:
||
||sudo $JAVA_HOME/bin/jmap -heap `cat /opt/opennms/logs/opennms.pid`
||
|
|[root <at> las-eng-test ~]# sudo /usr/java/jdk/bin/jmap -heap `cat
|/opt/opennms/logs/opennms.pid`
|Attaching to process ID 22913, please wait...
|Debugger attached successfully.
|Server compiler detected.
|JVM version is 1.5.0_10-b03
|
|using thread-local object allocation.
|Parallel GC with 4 thread(s)
|
|Heap Configuration:
| MinHeapFreeRatio = 40
| MaxHeapFreeRatio = 70
| MaxHeapSize = 1073741824 (1024.0MB)
| NewSize = 2686976 (2.5625MB)
| MaxNewSize = -65536 (-0.0625MB)
| OldSize = 1835008 (1.75MB)
| NewRatio = 2
| SurvivorRatio = 8
| PermSize = 21757952 (20.75MB)
| MaxPermSize = 88080384 (84.0MB)
|
|Heap Usage:
|PS Young Generation
|Eden Space:
| capacity = 342687744 (326.8125MB)
| used = 337463664 (321.83042907714844MB)
| free = 5224080 (4.9820709228515625MB)
| 98.47555680310528% used
|From Space:
| capacity = 7340032 (7.0MB)
| used = 4008688 (3.8229827880859375MB)
| free = 3331344 (3.1770172119140625MB)
| 54.614039829799104% used
|To Space:
| capacity = 7077888 (6.75MB)
| used = 0 (0.0MB)
| free = 7077888 (6.75MB)
| 0.0% used
|PS Old Generation
| capacity = 252313600 (240.625MB)
| used = 120093936 (114.53050231933594MB)
| free = 132219664 (126.09449768066406MB)
| 47.59709187297078% used
|PS Perm Generation
| capacity = 41025536 (39.125MB)
| used = 40704552 (38.818885803222656MB)
| free = 320984 (0.30611419677734375MB)
| 99.2175994970547% used
|
||
||
||So, is your system still running?
||
|
|Yes, still running
|
||-David
||
||
||David Hustace
||The OpenNMS Group, Inc.
||Main : +1 919 533 0160 Fax: +1 503-961-7746
||Direct: +1 919 827 1201 Skype: dhustace
||Key Fingerprint: 8EC1 F319 2C1D 3197 AAB9 580F FACD 1D19 175E D903
||
|
-------------------------------------------------------------------------
Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier.
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
_______________________________________________
Please read the OpenNMS Mailing List FAQ:
http://www.opennms.org/index.php/Mailing_List_FAQ
opennms-discuss mailing list
To *unsubscribe* or change your subscription options, see the bottom of this page:
https://lists.sourceforge.net/lists/listinfo/opennms-discuss