At Fri, 25 Apr 2008 17:15:04 +0000 (UTC), Me-planix.com wrote:
Subject: Re: kern/38273: "lockdebug_barrier: spin lock held" from
ld_ataraid_start_raid0()
>
> I've been trying my hand at looking deeper at this problem but I'm
> having a difficult time figuring out which lock is which, and at this
> point I'm not even sure if the mutex_vector_enter() in the stack
> backtrace is the same as mutex_enter() in the source or not.
>
> The first line in ldstart() is:
>
> mutex_enter(&sc->sc_mutex);
>
> Then a little bit later, before any mutex_exit(&sc->sc_mutex) there's a
> call, through the sc_start function pointer, to the ld_ataraid_start_raid0()
> routine.
>
> The only locking I can see that ld_ataraid_start_raid0() does is:
>
> mutex_enter(&cbp->cb_buf.b_vp->v_interlock);
>
> Is that the same lock as is used in ldstart(), i.e. the sc_mutex?
>
> Interestingly I see that before and after calling biodone(), ldstart()
> releases and then re-acquires the sc_mutex (if I'm interpreting this
> right):
>
> mutex_exit(&sc->sc_mutex);
> biodone(bp);
> mutex_enter(&sc->sc_mutex);
>
> Should the same be done before calling the sc_start function?
>
> Or should ld_ataraid_start_raid0() not be doing any locking at all?

As far as I can tell I haven't seen any reply to this yet.

It's still happening. I hadn't even got this far until today when
Juergen Hannken-Illjes suggested a working fix for my PR# 38636.

Now I'm back to this one. I've CC'ed tech-kern once again to see if
fresh eyes might help spot something obvious.

FYI, here's what the crash looks like today:

Mutex error: lockdebug_barrier: spin lock held
lock address : 0x00000000d185d7ac type : spin
initialized : 0x00000000c01f430c
shared holds : 0 exclusive: 1
shares wanted: 0 exclusive: 0
current cpu : 0 last held: 0
current lwp : 0x00000000d1e57380 last held: 0x00000000d1e57380
last locked : 0x00000000c01f3cee unlocked : 0x00000000c01f3d6b
owner field : 0x0000000000010600 wait/spin: 0/1
panic: LOCKDEBUG
fatal breakpoint trap in supervisor mode
trap type 1 code 0 eip c05ac52c cs 8 eflags 246 cr2 bbbfb000 ilevel 6
Stopped in pid 857.1 (newfs) at netbsd:breakpoint+0x4: popl %ebp
db{0}> trace
breakpoint(c0afbae3,d1c3d8c8,c0b29800,c04e351f,6,1,0,0,d1c3d8c8,8) at
netbsd:breakpoint+0x4
panic(c0a9eddc,c0a9a5f7,c087af90,c0a9edf5,0,1000001,6,0,0,d1823b80) at
netbsd:panic+0x1b8
lockdebug_abort1(c0a9edf5,1,0,0,c0aa38ce,d185d6cc,d1c3d92c,c049a1ca,c31f7e60,c0b25fa4)
at netbsd:lockdebug_abort1+0xbb
mutex_vector_enter(d1823b80,0,cc4c0000,200,6,0,c01f3cee,c32c5f44,0,efff1749) at
netbsd:mutex_vector_enter+0x437
ld_ataraid_start_raid0(d185d6cc,c31e860c,d1c3da4c,200,c32cda00,d185d7ac,d185d750,0,c31e860c,d185d6cc)
at netbsd:ld_ataraid_start_raid0+0x2e2
ldstart(6,c31e860c,0,0,c04b358b,101,0,d1818830,0,c32cda00) at
netbsd:ldstart+0x6e
ldstrategy(c31e860c,200,200,1,0,d181881c,d1818830,d1818834,bbbb5000,d1e57380)
at netbsd:ldstrategy+0x171
physio(c01f4770,0,4500,0,c01f3500,d1c3dc5c,d1c3db4c,c04d64b0,4500,d1c3dc5c) at
netbsd:physio+0x251
ldwrite(4500,d1c3dc5c,10,8,d1b09720,d1c3dc5c,6,d1e57380,d1c3dbe4,d1b09680) at
netbsd:ldwrite+0x35
cdev_write(4500,d1c3dc5c,10,2,d1b09720,d17fd000,d1c3db8c,c0522bf7,d1b09720,1)
at netbsd:cdev_write+0x70
spec_write(d1c3dbe4,bbbf8000,c087c740,d1b09680,2,20002,d1c3dbfc,c052e058,c087c240,d1b09680)
at netbsd:spec_write+0xa0
VOP_WRITE(d1b09680,d1c3dc5c,10,cc4a6a80,0,0,2,16,200,bbbb5000) at
netbsd:VOP_WRITE+0x6c
vn_write(d1e1c980,d1c3dcc4,d1c3dc5c,cc4a6a80,0,ffffffff,d1c3dc8c,c053632c,d1c3dc6c,d1e1c900)
at netbsd:vn_write+0xb1
dofilewrite(4,d1e1c980,bbbb5000,200,d1c3dcc4,0,d1c3dd28,c05b5b7f,0,0) at
netbsd:dofilewrite+0x75
sys_pwrite(d1e57380,d1c3dd00,d1c3dd28,bbbfb000,bbbfb000,d1ea2dd8,2,4,bbbb5000,200)
at netbsd:sys_pwrite+0xc7
syscall(d1c3dd48,b3,ab,1f,1f,0,1749efff,bfbfc8b8,0,0) at netbsd:syscall+0xab
db{0}> x/I 0x00000000c01f3cee
netbsd:ldstart+0x1e: testl %esi,%esi
db{0}> x/I 0x00000000d1e57380
0xd1e57380: addb %al,0(%eax)
db{0}> x/I 0x00000000c01f3d6b
netbsd:ldstart+0x9b: addl $0x1c,%esp
db{0}> x/I 0x00000000c01f430c
netbsd:ldattach+0x2c: testb $0x1,0x128(%edi)
db{0}> call simple_lock_dump
Symbol not found
db{0}>
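
For anyone skimming, the question quoted at the top boils down to the
pattern sketched here. This is only a userspace model that uses pthread
mutexes as stand-ins for the kernel ones. It is not the ld(4) code and
not a tested fix, and fake_softc, fake_softc_init(), fake_start(),
other_lock and run_queue() are all names made up for the illustration.
It shows a queue routine that can drop its lock around the start
callback, the way ldstart() already does around biodone(), and a
callback that refuses to run if that lock is still held.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the ld softc; not the real structure layout. */
struct fake_softc {
	pthread_mutex_t sc_mutex;	/* models the lock taken in ldstart() */
	int queued;			/* requests waiting to be started */
	int started;			/* requests the callback accepted */
	int (*sc_start)(struct fake_softc *);
};

/* Models a second lock that the start routine wants to take. */
static pthread_mutex_t other_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical start callback: fails if the caller still holds sc_mutex. */
static int
fake_start(struct fake_softc *sc)
{
	/* trylock on a default mutex returns EBUSY while it is locked. */
	if (pthread_mutex_trylock(&sc->sc_mutex) != 0)
		return EBUSY;
	pthread_mutex_unlock(&sc->sc_mutex);

	pthread_mutex_lock(&other_lock);
	sc->started++;
	pthread_mutex_unlock(&other_lock);
	return 0;
}

static void
fake_softc_init(struct fake_softc *sc, int queued)
{
	pthread_mutex_init(&sc->sc_mutex, NULL);
	sc->queued = queued;
	sc->started = 0;
	sc->sc_start = fake_start;
}

/*
 * Drain the queue. With drop_lock != 0 the lock is released around the
 * callback and re-acquired afterwards; with drop_lock == 0 the callback
 * is entered with the lock held. Returns 0 or the first callback error.
 */
static int
run_queue(struct fake_softc *sc, int drop_lock)
{
	int error = 0;

	assert(sc->sc_start != NULL);
	pthread_mutex_lock(&sc->sc_mutex);
	while (sc->queued > 0 && error == 0) {
		sc->queued--;
		if (drop_lock)
			pthread_mutex_unlock(&sc->sc_mutex);
		error = sc->sc_start(sc);
		if (drop_lock)
			pthread_mutex_lock(&sc->sc_mutex);
	}
	pthread_mutex_unlock(&sc->sc_mutex);
	return error;
}
```

Calling run_queue() with drop_lock set to 1 starts every queued request.
With drop_lock set to 0 the first call to fake_start() returns EBUSY,
which is only a rough userspace analogue of the LOCKDEBUG panic above.
Whether dropping sc_mutex around sc_start is actually safe in the real
driver is exactly the open question.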
--
Greg A. Woods
Planix, Inc.
<woods%planix.com@localhost> +1 416 489-5852 x122 http://www.planix.com/