tech-kern archive


Re: kern/38273: "lockdebug_barrier: spin lock held" from ld_ataraid_start_raid0()

At Fri, 25 Apr 2008 17:15:04 +0000 (UTC), wrote:
Subject: Re: kern/38273: "lockdebug_barrier: spin lock held" from 
>  I've been trying my hand at looking deeper at this problem but I'm
>  having a difficult time figuring out which lock is which, and at this
>  point I'm not even sure if the mutex_vector_enter() in the stack
>  backtrace is the same as mutex_enter() in the source or not.
>  The first line in ldstart() is:
>       mutex_enter(&sc->sc_mutex);
>  Then a little bit later, before any mutex_exit(&sc->sc_mutex) there's a
>  call, through the sc_start function pointer, to the ld_ataraid_start_raid0()
>  routine.
>  The only locking I can see that ld_ataraid_start_raid0() does is:
>                       mutex_enter(&cbp->cb_buf.b_vp->v_interlock);
>  Is that the same lock as is used in ldstart(), i.e. the sc_mutex?
>  Interestingly I see that before and after calling biodone(), ldstart()
>  releases and then re-acquires the sc_mutex (if I'm interpreting this
>  right):
>                               mutex_exit(&sc->sc_mutex);
>                               biodone(bp);
>                               mutex_enter(&sc->sc_mutex);
>  Should the same be done before calling the sc_start function?
>  Or should ld_ataraid_start_raid0() not be doing any locking at all?

As far as I can tell there has been no reply to this yet.

It's still happening.  I hadn't even got this far until today, when
Juergen Hannken-Illjes suggested a working fix for my PR 38636.

Now I'm back to this one.  I've CC'ed tech-kern once again to see if
fresh eyes might help spot something obvious.

FYI, here's what the crash looks like today:

Mutex error: lockdebug_barrier: spin lock held

lock address : 0x00000000d185d7ac type     :               spin
initialized  : 0x00000000c01f430c
shared holds :                  0 exclusive:                  1
shares wanted:                  0 exclusive:                  0
current cpu  :                  0 last held:                  0
current lwp  : 0x00000000d1e57380 last held: 0x00000000d1e57380
last locked  : 0x00000000c01f3cee unlocked : 0x00000000c01f3d6b
owner field  : 0x0000000000010600 wait/spin:                0/1

fatal breakpoint trap in supervisor mode
trap type 1 code 0 eip c05ac52c cs 8 eflags 246 cr2 bbbfb000 ilevel 6
Stopped in pid 857.1 (newfs) at netbsd:breakpoint+0x4:  popl    %ebp
db{0}> trace
breakpoint(c0afbae3,d1c3d8c8,c0b29800,c04e351f,6,1,0,0,d1c3d8c8,8) at 
panic(c0a9eddc,c0a9a5f7,c087af90,c0a9edf5,0,1000001,6,0,0,d1823b80) at 
 at netbsd:lockdebug_abort1+0xbb
mutex_vector_enter(d1823b80,0,cc4c0000,200,6,0,c01f3cee,c32c5f44,0,efff1749) at 
 at netbsd:ld_ataraid_start_raid0+0x2e2
ldstart(6,c31e860c,0,0,c04b358b,101,0,d1818830,0,c32cda00) at 
at netbsd:ldstrategy+0x171
physio(c01f4770,0,4500,0,c01f3500,d1c3dc5c,d1c3db4c,c04d64b0,4500,d1c3dc5c) at 
ldwrite(4500,d1c3dc5c,10,8,d1b09720,d1c3dc5c,6,d1e57380,d1c3dbe4,d1b09680) at 
at netbsd:cdev_write+0x70
 at netbsd:spec_write+0xa0
VOP_WRITE(d1b09680,d1c3dc5c,10,cc4a6a80,0,0,2,16,200,bbbb5000) at 
 at netbsd:vn_write+0xb1
dofilewrite(4,d1e1c980,bbbb5000,200,d1c3dcc4,0,d1c3dd28,c05b5b7f,0,0) at 
 at netbsd:sys_pwrite+0xc7
syscall(d1c3dd48,b3,ab,1f,1f,0,1749efff,bfbfc8b8,0,0) at netbsd:syscall+0xab
db{0}> x/I 0x00000000c01f3cee
netbsd:ldstart+0x1e:    testl   %esi,%esi
db{0}> x/I 0x00000000d1e57380
0xd1e57380:     addb    %al,0(%eax)
db{0}> x/I 0x00000000c01f3d6b
netbsd:ldstart+0x9b:    addl    $0x1c,%esp
db{0}> x/I 0x00000000c01f430c
netbsd:ldattach+0x2c:   testb   $0x1,0x128(%edi)
db{0}> call simple_lock_dump
Symbol not found
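
For anyone trying to follow the question about releasing sc_mutex around
the sc_start call, here is a userland toy model of the situation, not
NetBSD code.  It rests on two assumptions.  First, that the spin lock
named in the dump is sc_mutex, which seems consistent with "initialized"
resolving to ldattach+0x2c and "last locked" to ldstart+0x1e.  Second,
that v_interlock is an adaptive mutex and that the check trips whenever
an adaptive mutex is entered while a spin mutex is held.  Every toy_*
name below is hypothetical, and the "locks" only record state.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy lock: records state only, does no real locking. */
struct toy_lock {
	bool spin;	/* true: spin mutex; false: adaptive (may sleep) */
	bool held;
};

static int toy_spin_held;	/* spin locks currently held */
static int toy_violations;	/* times the toy barrier check tripped */

static void
toy_enter(struct toy_lock *l)
{
	assert(!l->held);
	if (l->spin)
		toy_spin_held++;
	else if (toy_spin_held > 0)
		toy_violations++;	/* assumed "spin lock held" condition */
	l->held = true;
}

static void
toy_exit(struct toy_lock *l)
{
	assert(l->held);
	if (l->spin)
		toy_spin_held--;
	l->held = false;
}

struct toy_softc {
	struct toy_lock sc_mutex;	/* stands in for sc->sc_mutex */
	struct toy_lock interlock;	/* stands in for v_interlock */
	int queued;
	int started;
};

static struct toy_softc
toy_softc_make(int queued)
{
	struct toy_softc sc = {
		.sc_mutex = { .spin = true },
		.interlock = { .spin = false },
		.queued = queued,
	};
	return sc;
}

/* Stands in for the sc_start callback, which takes the interlock. */
static void
toy_start(struct toy_softc *sc)
{
	toy_enter(&sc->interlock);
	sc->started++;
	toy_exit(&sc->interlock);
}

/* Returns how many times the toy check tripped during this call. */
static int
toy_ldstart(struct toy_softc *sc, bool drop_lock)
{
	int before = toy_violations;

	toy_enter(&sc->sc_mutex);
	while (sc->queued > 0) {
		sc->queued--;
		if (drop_lock)
			toy_exit(&sc->sc_mutex);	/* as done around biodone() */
		toy_start(sc);
		if (drop_lock)
			toy_enter(&sc->sc_mutex);
	}
	toy_exit(&sc->sc_mutex);
	return toy_violations - before;
}
```

Calling toy_ldstart() with drop_lock false trips the toy check once per
queued request; with drop_lock true it never trips.  Whether dropping
sc_mutex at that point is actually safe in the real ldstart() is a
separate question that this sketch does not answer.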

                                                Greg A. Woods
                                                Planix, Inc.

                                                +1 416 489-5852 x122

