Tom Ivar Helbekkmo | 2 May 2012 09:36

Re: occational system lock up in 6.0_BETA and 6.99.5

Bernd Ernesti <netbsd <at> lists.veego.de> writes:

> As D'Arcy I had a lock up on i386 too.

NetBSD/amd64 is hanging itself up during heavy loads, too - there are
several reports on this currently being discussed on port-amd64.  In my
case, at least, it's obviously related to heavy disk I/O, as everything
else continues to work, but every process that tries to access disk gets
stuck in biowait.  I'm running a very current amd64-current.

-tih
-- 
"The market" is a bunch of 28-year-olds who don't know anything. --Paul Krugman

Toru Nishimura | 3 May 2012 05:47

Re: occational system lock up in 6.0_BETA and 6.99.5

More about system lockup under heavy load.

> ~# ... DDB break in works ...
> 
> db> bt
> 0x0060bd90: at comintr+0x590
> 0x0060bde0: at pic_handle_intr+0x198
> 0x0060be20: at trapstart+0x684
> 0x0060bef0: at sched_curcpu_runnable_p+0x2c <<< THIS <<<
> 0x0060bf00: at idle_loop+0xe8
> 0x0060bf20: at setfunc_trampoline+0x8
> saved LR(0x7ffffd) is invalid.
> ...
>
> The last operation
> I did is invoking "ps xa" and DDB shows no ps process.

It's now quite obvious that this is a "scheduler stuck" condition.  The DDB
traceback is the same as in the GCC 4.5 genautomata lockup case.

db> bt
0x0060bd90: at comintr+0x590
0x0060bde0: at pic_handle_intr+0x198
0x0060be20: at trapstart+0x684
0x0060bef0: at sched_curcpu_runnable_p+0x2c <<< HERE <<<
0x0060bf00: at idle_loop+0xe8
0x0060bf20: at cpu_lwp_bootstrap+0xc
saved LR(0x7ffffd) is invalid.

Toru Nishimura / ALKYL Technology

Toru Nishimura | 3 May 2012 08:33

Re: occational system lock up in 6.0_BETA and 6.99.5

Some more rumblings on the system lockup under heavy disk I/O.

I found that the dump lockup happens with the combination of WAPBL and
a dump -X snapshot.

- A standard dump without the -X snapshot works both with and without
  WAPBL.
- A snapshot dump of a filesystem with WAPBL turned off
  (/sbin/mount -o update,nolog) works.

Amusingly, re-enabling WAPBL (-o update,log) while dump -X is running is
fine, but the next invocation of dump -X locks up some seconds later.

Toru Nishimura / ALKYL Technology

Thor Lancelot Simon | 3 May 2012 14:34

Re: occational system lock up in 6.0_BETA and 6.99.5

On Wed, May 02, 2012 at 09:36:27AM +0200, Tom Ivar Helbekkmo wrote:
> Bernd Ernesti <netbsd <at> lists.veego.de> writes:
> 
> > As D'Arcy I had a lock up on i386 too.
> 
> NetBSD/amd64 is hanging itself up during heavy loads, too - there are
> several reports on this currently being discussed on port-amd64.  In my
> case, at least, it's obviously related to heavy disk I/O, as everything
> else continues to work, but every process that tries to access disk gets
> stuck in biowait.  I'm running a very current amd64-current.

Make it more -current; a fix for one problem of this kind was checked in
over the weekend.

Thor

D'Arcy Cain | 4 May 2012 01:19

Re: occasional system lock up in 6.0_BETA and 6.99.5

On 12-05-03 02:11 PM, Tom Ivar Helbekkmo wrote:
> Thor Lancelot Simon<tls <at> panix.com>  writes:
>
>> Make it more -current; a fix for one problem of this kind was checked in
>> over the weekend.
>
> I am already current as per May 1st, so I guess the one that's biting me
> is a different one.  I'm now running with SMP disabled, to see if it'll
> stay up like that.

Same here.  I am running from -current as of May 2.  Haven't tried
disabling SMP yet.  Tried disabling ACPI but same thing.

OK, I had to fix the typo in the subject.  :-)

-- 
D'Arcy J.M. Cain
System Administrator, Vex.Net
http://www.Vex.Net/ IM:darcy <at> Vex.Net

Toru Nishimura | 8 May 2012 05:24

Re: occasional system lock up in 6.0_BETA and 6.99.5

Hi,

With the DDB backtrace we can see the following:

db> bt
0x0060bd90: at comintr+0x590
0x0060bde0: at pic_handle_intr+0x198
0x0060be20: at trapstart+0x684
0x0060bef0: at sched_curcpu_runnable_p+0x2c <<< HERE <<<
0x0060bf00: at idle_loop+0xe8
0x0060bf20: at cpu_lwp_bootstrap+0xc
saved LR(0x7ffffd) is invalid.

Here is the objdump listing of the offending code:

00181800 <sched_curcpu_runnable_p>:
  181800:       7c 08 02 a6     mflr    r0
  181804:       94 21 ff f0     stwu    r1,-16(r1)
  181808:       93 e1 00 0c     stw     r31,12(r1)
  18180c:       90 01 00 14     stw     r0,20(r1)
  181810:       48 00 6a 95     bl      1882a4 <kpreempt_disable>
  181814:       7d 30 42 a6     mfsprg  r9,0
  181818:       81 29 00 30     lwz     r9,48(r9)
  18181c:       80 69 00 1c     lwz     r3,28(r9)
  181820:       7c 63 00 34     cntlzw  r3,r3
  181824:       54 63 d9 7e     rlwinm  r3,r3,27,5,31
  181828:       68 7f 00 01     xori    r31,r3,1
  18182c:       48 00 71 c5     bl      1889f0 <kpreempt_enable> <<< L@@K <<<
  181830:       80 01 00 14     lwz     r0,20(r1)
  181834:       7f e3 fb 78     mr      r3,r31
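
For readers less familiar with PowerPC, the arithmetic between the two bl
calls in the portion of the listing shown above can be traced with a small
Python sketch.  It models only cntlzw, rlwinm and xori on a 32-bit word; the
helper names and the function runnable_from_word are made up for
illustration, and which kernel field is loaded from 28(r9) is not something
the listing alone tells us.  Under those assumptions, the three instructions
appear to reduce to "word != 0".

```python
MASK32 = 0xFFFFFFFF


def cntlzw(x):
    """Count leading zero bits of a 32-bit word (32 when x is 0)."""
    x &= MASK32
    return 32 - x.bit_length()


def rlwinm(x, sh, mb, me):
    """Rotate a 32-bit word left by sh, then keep bits mb..me.

    Bit numbering is PowerPC style: bit 0 is the most significant bit.
    This sketch assumes mb <= me (no wrap-around mask).
    """
    x &= MASK32
    rotated = ((x << sh) | (x >> (32 - sh))) & MASK32
    mask = (MASK32 >> mb) & (MASK32 << (31 - me)) & MASK32
    return rotated & mask


def runnable_from_word(word):
    """Hypothetical helper: mirror the steps at 181820..181828."""
    r3 = cntlzw(word)           # cntlzw r3,r3
    r3 = rlwinm(r3, 27, 5, 31)  # rlwinm r3,r3,27,5,31 (acts as a shift right by 5)
    return r3 ^ 1               # xori   r31,r3,1


if __name__ == "__main__":
    for word in (0, 1, 2, 0x80000000, 0xFFFFFFFF):
        print(hex(word), runnable_from_word(word))
```

The count of leading zeros is 32 only for a zero word, so shifting it right
by five bits yields 1 for zero and 0 for anything else; the final xori
inverts that.  The value left in r31 is therefore already computed before
kpreempt_enable is called.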

Tom Ivar Helbekkmo | 8 May 2012 13:22

Re: occational system lock up in 6.0_BETA and 6.99.5

I wrote:

> Bernd Ernesti <netbsd <at> lists.veego.de> writes:
>
>> As D'Arcy I had a lock up on i386 too.
>
> NetBSD/amd64 is hanging itself up during heavy loads, too - there are
> several reports on this currently being discussed on port-amd64.

It turns out the problem biting my amd64 box was specific to that port.

-tih
-- 
"The market" is a bunch of 28-year-olds who don't know anything. --Paul Krugman

D'Arcy Cain | 8 May 2012 18:28

Re: occational system lock up in 6.0_BETA and 6.99.5

On 12-05-08 07:22 AM, Tom Ivar Helbekkmo wrote:
> It turns out the problem biting my amd64 box was specific to that port.

Yes, the discussion has branched to the i386 mailing list for my issue.

-- 
D'Arcy J.M. Cain <darcy <at> NetBSD.org>
http://www.NetBSD.org/ IM:darcy <at> Vex.Net
