At Fri, 25 Apr 2008 17:15:04 +0000 (UTC), Me-planix.com wrote:
Subject: Re: kern/38273: "lockdebug_barrier: spin lock held" from ld_ataraid_start_raid0()
>
> I've been trying my hand at looking deeper at this problem but I'm
> having a difficult time figuring out which lock is which, and at this
> point I'm not even sure if the mutex_vector_enter() in the stack
> backtrace is the same as mutex_enter() in the source or not.
>
> The first line in ldstart() is:
>
> 	mutex_enter(&sc->sc_mutex);
>
> Then a little bit later, before any mutex_exit(&sc->sc_mutex) there's a
> call, through the sc_start function pointer, to the ld_ataraid_start_raid0()
> routine.
>
> The only locking I can see that ld_ataraid_start_raid0() does is:
>
> 	mutex_enter(&cbp->cb_buf.b_vp->v_interlock);
>
> Is that the same lock as is used in ldstart(), i.e. the sc_mutex?
>
> Interestingly I see that before and after calling biodone(), ldstart()
> releases and then re-acquires the sc_mutex (if I'm interpreting this
> right):
>
> 	mutex_exit(&sc->sc_mutex);
> 	biodone(bp);
> 	mutex_enter(&sc->sc_mutex);
>
> Should the same be done before calling the sc_start function?
>
> Or should ld_ataraid_start_raid0() not be doing any locking at all?

As far as I can tell I haven't seen any reply to this yet.

It's still happening. I hadn't even got this far until today when
Juergen Hannken-Illjes suggested a working fix for my PR# 38636. Now
I'm back to this one.

I've CC'ed tech-kern once again to see if fresh eyes might help spot
something obvious.
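On the quoted question of which lock is which: the debug output that
follows reports the held lock as type spin, initialized at ldattach+0x2c
and last locked at ldstart+0x1e, which points at sc_mutex rather than the
vnode interlock. The rule that seems to be tripping can be sketched in
userland C as follows. This is only a toy model, not kernel code:
model_lock, model_enter(), model_exit() and the two start_*() functions
are made-up names, not NetBSD interfaces, and treating v_interlock as an
adaptive (sleepable) mutex is an assumption here, not something the trace
states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userland model; none of these names are NetBSD APIs. */
enum lock_type { LOCK_SPIN, LOCK_ADAPTIVE };

struct model_lock {
	const char *name;
	enum lock_type type;
	bool held;
};

/* Number of spin locks held by the single modelled CPU. */
static int spin_locks_held;

/*
 * Try to take lk.  Returns false, without taking the lock, if this
 * would acquire an adaptive lock while a spin lock is held (the
 * condition the real kernel is assumed to panic on).
 */
static bool
model_enter(struct model_lock *lk)
{
	if (lk->type == LOCK_ADAPTIVE && spin_locks_held > 0) {
		printf("barrier: spin lock held while taking %s\n", lk->name);
		return false;
	}
	lk->held = true;
	if (lk->type == LOCK_SPIN)
		spin_locks_held++;
	return true;
}

static void
model_exit(struct model_lock *lk)
{
	assert(lk->held);
	lk->held = false;
	if (lk->type == LOCK_SPIN)
		spin_locks_held--;
}

/* Ordering from the report: start routine runs with the spin lock held. */
static bool
start_holding_spin(struct model_lock *sc, struct model_lock *vi)
{
	bool ok;

	model_enter(sc);
	ok = model_enter(vi);		/* trips the barrier check */
	if (ok)
		model_exit(vi);
	model_exit(sc);
	return ok;
}

/* Ordering used around biodone(): drop the spin lock, call, re-take it. */
static bool
start_dropping_spin(struct model_lock *sc, struct model_lock *vi)
{
	bool ok;

	model_enter(sc);
	model_exit(sc);
	ok = model_enter(vi);		/* no spin lock held here */
	if (ok)
		model_exit(vi);
	model_enter(sc);
	model_exit(sc);
	return ok;
}
```

With one spin lock standing in for sc_mutex and one adaptive lock standing
in for v_interlock, start_holding_spin() returns false and
start_dropping_spin() returns true. That only shows which ordering passes
the modelled check; it says nothing about whether dropping sc_mutex around
the sc_start call is safe for the queue state ldstart() is protecting.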
FYI, here's what the crash looks like today:

Mutex error: lockdebug_barrier: spin lock held

lock address : 0x00000000d185d7ac type     :               spin
initialized  : 0x00000000c01f430c
shared holds :                  0 exclusive:                  1
shares wanted:                  0 exclusive:                  0
current cpu  :                  0 last held:                  0
current lwp  : 0x00000000d1e57380 last held: 0x00000000d1e57380
last locked  : 0x00000000c01f3cee unlocked : 0x00000000c01f3d6b
owner field  : 0x0000000000010600 wait/spin:                0/1

panic: LOCKDEBUG
fatal breakpoint trap in supervisor mode
trap type 1 code 0 eip c05ac52c cs 8 eflags 246 cr2 bbbfb000 ilevel 6
Stopped in pid 857.1 (newfs) at netbsd:breakpoint+0x4:  popl    %ebp
db{0}> trace
breakpoint(c0afbae3,d1c3d8c8,c0b29800,c04e351f,6,1,0,0,d1c3d8c8,8) at netbsd:breakpoint+0x4
panic(c0a9eddc,c0a9a5f7,c087af90,c0a9edf5,0,1000001,6,0,0,d1823b80) at netbsd:panic+0x1b8
lockdebug_abort1(c0a9edf5,1,0,0,c0aa38ce,d185d6cc,d1c3d92c,c049a1ca,c31f7e60,c0b25fa4) at netbsd:lockdebug_abort1+0xbb
mutex_vector_enter(d1823b80,0,cc4c0000,200,6,0,c01f3cee,c32c5f44,0,efff1749) at netbsd:mutex_vector_enter+0x437
ld_ataraid_start_raid0(d185d6cc,c31e860c,d1c3da4c,200,c32cda00,d185d7ac,d185d750,0,c31e860c,d185d6cc) at netbsd:ld_ataraid_start_raid0+0x2e2
ldstart(6,c31e860c,0,0,c04b358b,101,0,d1818830,0,c32cda00) at netbsd:ldstart+0x6e
ldstrategy(c31e860c,200,200,1,0,d181881c,d1818830,d1818834,bbbb5000,d1e57380) at netbsd:ldstrategy+0x171
physio(c01f4770,0,4500,0,c01f3500,d1c3dc5c,d1c3db4c,c04d64b0,4500,d1c3dc5c) at netbsd:physio+0x251
ldwrite(4500,d1c3dc5c,10,8,d1b09720,d1c3dc5c,6,d1e57380,d1c3dbe4,d1b09680) at netbsd:ldwrite+0x35
cdev_write(4500,d1c3dc5c,10,2,d1b09720,d17fd000,d1c3db8c,c0522bf7,d1b09720,1) at netbsd:cdev_write+0x70
spec_write(d1c3dbe4,bbbf8000,c087c740,d1b09680,2,20002,d1c3dbfc,c052e058,c087c240,d1b09680) at netbsd:spec_write+0xa0
VOP_WRITE(d1b09680,d1c3dc5c,10,cc4a6a80,0,0,2,16,200,bbbb5000) at netbsd:VOP_WRITE+0x6c
vn_write(d1e1c980,d1c3dcc4,d1c3dc5c,cc4a6a80,0,ffffffff,d1c3dc8c,c053632c,d1c3dc6c,d1e1c900) at netbsd:vn_write+0xb1
dofilewrite(4,d1e1c980,bbbb5000,200,d1c3dcc4,0,d1c3dd28,c05b5b7f,0,0) at netbsd:dofilewrite+0x75
sys_pwrite(d1e57380,d1c3dd00,d1c3dd28,bbbfb000,bbbfb000,d1ea2dd8,2,4,bbbb5000,200) at netbsd:sys_pwrite+0xc7
syscall(d1c3dd48,b3,ab,1f,1f,0,1749efff,bfbfc8b8,0,0) at netbsd:syscall+0xab
db{0}> x/I 0x00000000c01f3cee
netbsd:ldstart+0x1e:    testl   %esi,%esi
db{0}> x/I 0x00000000d1e57380
0xd1e57380:     addb    %al,0(%eax)
db{0}> x/I 0x00000000c01f3d6b
netbsd:ldstart+0x9b:    addl    $0x1c,%esp
db{0}> x/I 0x00000000c01f430c
netbsd:ldattach+0x2c:   testb   $0x1,0x128(%edi)
db{0}> call simple_lock_dump
Symbol not found
db{0}>

-- 
						Greg A. Woods
Planix, Inc. <woods%planix.com@localhost>     +1 416 489-5852 x122     http://www.planix.com/