Subject: Re: yamt-idlelwp fallout for mips/cobalt?
To: Izumi Tsutsui <tsutsui@ceres.dti.ne.jp>
From: Andrew Doran <ad@NetBSD.org>
List: port-cobalt
Date: 06/15/2007 14:33:27
On Fri, Jun 15, 2007 at 10:26:02PM +0900, Izumi Tsutsui wrote:

> ad@NetBSD.org wrote:
> 
> > > Looking back in the mail archives, this seems to be an attempt
> > > to fix a "locking against myself" panic, so I suspect it's more
> > > likely a locking error somewhere.
> > 
> > The call into lwp_startup() does an spl0(). Before it does that, it also
> > unlocks the previous LWP if any. If we enable interrupts before unlocking
> > the previous LWP, we can end up taking an interrupt and trying to acquire
> > a spinlock that is already held.
> 
> ddb trace on today's -current kernel shows:
> ---
> Mounting all filesystems...
> Mutex error: lockdebug_wantlock: locking against myself
> 
> lock address : 0x0000000080318d00 type     :               spin
> shared holds :                  0 exclusive:                  1
> shares wanted:                  0 exclusive:                  1
> current cpu  :                  0 last held:                  0
> current lwp  : 0x000000008fc8b000 last held: 0x000000008fc8b700
> last locked  : 0x000000008018ad6c unlocked : 0x0000000080184010
> owner field  : 000000000000000000 wait/spin:                0/1
> 
> panic: LOCKDEBUG
> Stopped in pid 249.1 (nfsio) at netbsd:cpu_Debugger+0x4:        jr      ra
>                 bdslot: nop
> db> tr
> cpu_Debugger+4 (8fffe000,802f3370,d,0) ra 801a6508 sz 0
> panic+190 (8fffe000,802f3370,d,0) ra 8019e218 sz 48
> lockdebug_abort1+70 (8fffe000,802f3370,d,0) ra 8019ee20 sz 32
> lockdebug_wantlock+200 (8fffe000,802f3370,d,0) ra 80175e40 sz 40
> mutex_vector_enter+1a4 (8fffe000,802f3370,d,0) ra 80189ec0 sz 48
> wakeup+58 (8fffe000,802f3370,d,0) ra 801cf820 sz 32
> sowakeup+f0 (8fffe000,802f3370,d,0) ra 8002536c sz 40
> udp4_sendup+10c (8fffe000,802f3370,d,0) ra 80025928 sz 48
> udp_input+4d4 (8fffe000,14,11,8030b7b0) ra 8000e534 sz 176
> ipintr+b4 (8fffe000,14,11,8030b7b0) ra 8021e2fc sz 40
> netintr+68 (8fffe000,14,11,8030b7b0) ra 80230a34 sz 24
> softintr_dispatch+f4 (200,14,11,8030b7b0) ra 8022fc88 sz 56
> cpu_intr+e4 (200,14,11,8030b7b0) ra 802185d4 sz 56
> mips3_KernIntr+84 (cc6e6000,3d099f,3d095f,0) ra 80218bc0 sz 128
> mips3_lwp_trampoline+0 (cc6e6000,3d099f,3d095f,0) ra 0 sz 0
> User-level: pid 249.1
> db> 
> ---
> 
> so a softintr(9) occurred (or was enabled) as soon as
> the kernel jumped to lwp_trampoline(), before lwp_startup()
> was called.

Is this with or without your patch?

Thanks,
Andrew