Izumi Tsutsui | 4 Jun 13:54 2007

Re: Qube2 "crash" every few days during the daily script

I wrote:

> Looks like an MI ksyms(4) problem, but I'll take a look at it later.

I have not tracked this down yet, but it doesn't seem to happen
with recent (20070601) sources.
(The same test has been running for ~20 hours.)

Could you try the recent kernel (with old userland)?
ftp://ftp.netbsd.org/pub/NetBSD-daily/HEAD/200706030002Z/cobalt/binary/kernel/

---
Izumi Tsutsui

Rémi Zara | 5 Jun 22:39 2007

Re: Qube2 "crash" every few days during the daily script


On 4 Jun 2007, at 13:54, Izumi Tsutsui wrote:

> I wrote:
>
>> Looks like an MI ksyms(4) problem, but I'll take a look at it later.
>
> I have not tracked this down yet, but it doesn't seem to happen
> with recent (20070601) sources.
> (The same test has been running for ~20 hours.)
>
> Could you try the recent kernel (with old userland)?
> ftp://ftp.netbsd.org/pub/NetBSD-daily/HEAD/200706030002Z/cobalt/binary/kernel/


The Qube2 crashed again with the kernel you sent me:

Jun  5 22:21:26 qube2 /netbsd: trap: TLB miss (load or instr. fetch)  
in kernel mode
Jun  5 22:21:26 qube2 /netbsd: status=0x3, cause=0x8, epc=0x801d3fdc,  
vaddr=0x7ff547cb
Jun  5 22:21:26 qube2 /netbsd: pid=23103 cmd=netstat usp=0x7fffd4c0  
ksp=0xcc735c2?
Jun  5 22:21:26 qube2 /netbsd: Copyright (c) 1996, 1997, 1998, 1999,  
2000, 2001, 2002, 2003, 2004, 2005,
Jun  5 22:21:26 qube2 /netbsd: 2006, 2007
Jun  5 22:21:26 qube2 /netbsd: The NetBSD Foundation, Inc.  All  
rights reserved.
Jun  5 22:21:26 qube2 /netbsd: Copyright (c) 1982, 1986, 1989, 1991,  

Rémi Zara | 9 Jun 19:05 2007

Qube2 serial console from Quadra 800 (mac68k)

Hi,

I tried, without success, to hook up a Quadra 800 (mac68k, NetBSD
2.0) to a Qube2 as a serial console.

I used a cable I had from an old modem, with a DB9 end and a DIN8 end  
to connect the two machines.

I tried printer and modem port, with the following remote definitions:

qube2:dv=/dev/dty01:br#115200:pa=none:

qube2:dv =/dev/dty01:br#9600:pa=none:dc:

I tried with tty01 also.

I did activate the serial console on the Qube, and went through
multiple power cycles.

Does anyone have an idea how to make this work? Is it the
configuration, or is my cable not suitable?

Regards,

Rémi Zara

Izumi Tsutsui | 9 Jun 19:10 2007

Re: Qube2 serial console from Quadra 800 (mac68k)

remi_zara <at> mac.com wrote:

> I used a cable I had from an old modem, with a DB9 end and a DIN8 end  
> to connect the two machines.

You need a cross (null-modem) cable.
See "NetBSD Serial Port Primer" page:
http://www.netbsd.org/docs/Hardware/Misc/serial.html
for details. (though there is no DB9-DIN8 null modem pinout)
---
Izumi Tsutsui

Rémi Zara | 10 Jun 09:53 2007

Re: Qube2 serial console from Quadra 800 (mac68k)


On 9 Jun 2007, at 19:10, Izumi Tsutsui wrote:

> remi_zara <at> mac.com wrote:
>
>> I used a cable I had from an old modem, with a DB9 end and a DIN8 end
>> to connect the two machines.
>
> You need a cross (null-modem) cable.

Oh, I forgot about the null-modem thing...

> See "NetBSD Serial Port Primer" page:
> http://www.netbsd.org/docs/Hardware/Misc/serial.html

Lots of good info there. Thanks for the pointer.

> for details. (though there is no DB9-DIN8 null modem pinout)

There is a specialized store here in Paris that sells one for 22€
(http://ssl.conector.fr/html/produit.phtml?rub=CORDONS&srub=CORDONS+TRANSFERT+DE+FICHIERS+NULL+MODEM).
Ouch!

Thanks a lot,

Rémi Zara


Michael L. Hitch | 14 Jun 19:54 2007

Re: yamt-idlelwp fallout for mips/cobalt?

On Fri, 25 May 2007, Izumi Tsutsui wrote:

> The following patch makes a LOCKDEBUG kernel work,
> but I don't know if it's really correct.

   I don't think it's correct.  This was changed several years ago
(starting with revision 1.175) because kernel threads would start
and run with interrupts disabled.  I had the same problem with my
amiga (m68k) not all that long ago because I had added a raidframe
drive and was getting lots of clock skew when parity rebuilding was
going on.  I finally figured out that the raidframe thread was running
with interrupts blocked, and started looking at several other ports
to see how they started kthread processes and found that they had had
the same problem, but that it had been fixed for several years.

   Looking back in the mail archives, this seems to be an attempt
to fix a locking against myself panic, so I suspect it's more likely
a locking error somewhere.  I remember that the m68k port had a similar
problem when running a DIAGNOSTIC kernel (which I had not tested at
the time), and I tracked down a small section of code I had missed
for the idlelwp changes.

> Index: arch/mips/mips/vm_machdep.c
> ===================================================================
> RCS file: /cvsroot/src/sys/arch/mips/mips/vm_machdep.c,v
> retrieving revision 1.117
> diff -u -r1.117 vm_machdep.c
> --- arch/mips/mips/vm_machdep.c	17 May 2007 14:51:25 -0000	1.117
> +++ arch/mips/mips/vm_machdep.c	25 May 2007 14:47:42 -0000
> @@ -170,7 +170,9 @@

Andrew Doran | 14 Jun 22:37 2007

Re: yamt-idlelwp fallout for mips/cobalt?

On Thu, Jun 14, 2007 at 11:54:39AM -0600, Michael L. Hitch wrote:

> On Fri, 25 May 2007, Izumi Tsutsui wrote:
> 
> >The following patch makes a LOCKDEBUG kernel work,
> >but I don't know if it's really correct.
> 
>   I don't think it's correct.  This was changed several years ago
> (starting with revision 1.175) because kernel threads would start
> and run with interrupts disabled.  I had the same problem with my
> amiga (m68k) not all that long ago because I had added a raidframe
> drive and was getting lots of clock skew when parity rebuilding was
> going on.  I finally figured out that the raidframe thread was running
> with interrupts blocked, and started looking at several other ports
> to see how they started kthread processes and found that they had had
> the same problem, but that it had been fixed for several years.

> >Index: arch/mips/mips/vm_machdep.c
> >===================================================================
> >RCS file: /cvsroot/src/sys/arch/mips/mips/vm_machdep.c,v
> >retrieving revision 1.117
> >diff -u -r1.117 vm_machdep.c
> >--- arch/mips/mips/vm_machdep.c	17 May 2007 14:51:25 -0000	1.117
> >+++ arch/mips/mips/vm_machdep.c	25 May 2007 14:47:42 -0000
> >@@ -170,7 +170,9 @@
> >	pcb->pcb_context[MIPS_CURLWP_CARD - 16] = (intptr_t)l2;/* S? */
> >	pcb->pcb_context[8] = (intptr_t)f;		/* SP */
> >	pcb->pcb_context[10] = (intptr_t)lwp_trampoline;/* RA */
> >+#if 0
> >	pcb->pcb_context[11] |= PSL_LOWIPL;		/* SR */

Michael L. Hitch | 14 Jun 22:56 2007

Re: yamt-idlelwp fallout for mips/cobalt?

On Thu, 14 Jun 2007, Andrew Doran wrote:

> I'm not yet sure what we need to set into SR in this case, but post
> yamt-idlelwp, LWPs should start up at IPL_SCHED as far as MD code is
> concerned. cpu_switchto() is expected to maintain the IPL at IPL_SCHED or
> above across the switch. If I read the code correctly, PSL_LOWIPL enables
> all interrupts.

   Ah - I just looked at the alpha changes, and the code to set the IPL
level for the new lwp was removed, leaving it as copied from the
parent.  I thought I had found something similar in the i386 code, but
I can't find what I remember.

> The call into lwp_startup() does an spl0(). Before it does that, it also
> unlocks the previous LWP if any. If we enable interrupts before unlocking
> the previous LWP, we can end up taking an interrupt and trying to acquire
> a spinlock that is already held.

   In that case, I think not changing the interrupt masking when forking a
new lwp is probably what is now desired.  The new process will inherit the
IPL from the parent process (although I'm not sure what it actually
is for the various ports).

--
Michael L. Hitch			mhitch <at> montana.edu
Computer Consultant
Information Technology Center
Montana State University	Bozeman, MT	USA


Izumi Tsutsui | 15 Jun 15:26 2007

Re: yamt-idlelwp fallout for mips/cobalt?

ad <at> NetBSD.org wrote:

> > Looking back in the mail archives, this seems to be an attempt
> > to fix a locking against myself panic, so I suspect it's more likely
> > a locking error somewhere. 
> 
> The call into lwp_startup() does an spl0(). Before it does that, it also
> unlocks the previous LWP if any. If we enable interrupts before unlocking
> the previous LWP, we can end up taking an interrupt and trying to acquire
> a spinlock that is already held.

ddb trace on today's -current kernel shows:
---
Mounting all filesystems...
Mutex error: lockdebug_wantlock: locking against myself

lock address : 0x0000000080318d00 type     :               spin
shared holds :                  0 exclusive:                  1
shares wanted:                  0 exclusive:                  1
current cpu  :                  0 last held:                  0
current lwp  : 0x000000008fc8b000 last held: 0x000000008fc8b700
last locked  : 0x000000008018ad6c unlocked : 0x0000000080184010
owner field  : 000000000000000000 wait/spin:                0/1

panic: LOCKDEBUG
Stopped in pid 249.1 (nfsio) at netbsd:cpu_Debugger+0x4:        jr      ra
                bdslot: nop
db> tr
cpu_Debugger+4 (8fffe000,802f3370,d,0) ra 801a6508 sz 0
panic+190 (8fffe000,802f3370,d,0) ra 8019e218 sz 48

Andrew Doran | 15 Jun 15:33 2007

Re: yamt-idlelwp fallout for mips/cobalt?

On Fri, Jun 15, 2007 at 10:26:02PM +0900, Izumi Tsutsui wrote:

> ad <at> NetBSD.org wrote:
> 
> > > Looking back in the mail archives, this seems to be an attempt
> > > to fix a locking against myself panic, so I suspect it's more likely
> > > a locking error somewhere. 
> > 
> > The call into lwp_startup() does an spl0(). Before it does that, it also
> > unlocks the previous LWP if any. If we enable interrupts before unlocking
> > the previous LWP, we can end up taking an interrupt and trying to acquire
> > a spinlock that is already held.
> 
> ddb trace on today's -current kernel shows:
> ---
> Mounting all filesystems...
> Mutex error: lockdebug_wantlock: locking against myself
> 
> lock address : 0x0000000080318d00 type     :               spin
> shared holds :                  0 exclusive:                  1
> shares wanted:                  0 exclusive:                  1
> current cpu  :                  0 last held:                  0
> current lwp  : 0x000000008fc8b000 last held: 0x000000008fc8b700
> last locked  : 0x000000008018ad6c unlocked : 0x0000000080184010
> owner field  : 000000000000000000 wait/spin:                0/1
> 
> panic: LOCKDEBUG
> Stopped in pid 249.1 (nfsio) at netbsd:cpu_Debugger+0x4:        jr      ra
>                 bdslot: nop
> db> tr

