Subject: current + LOCKDEBUG vs fifos: locking against myself panic
To: None <tech-kern@netbsd.org>
From: Bill Sommerfeld <sommerfeld@netbsd.org>
List: tech-kern
Date: 09/22/2001 17:46:56
A recent change (probably related to ubcperf, but I'm not 100%
certain) causes a lock leak in ffs_softdep.c; the "last locked"
file/line reported was here:

		FREE_LOCK(&lk);
===>>>		simple_lock(&uobj->vmobjlock);		<<<===
		error = (uobj->pgops->pgo_put)(uobj, 0, 0,
		    PGO_ALLPAGES|PGO_CLEANIT|
		    (waitfor == MNT_NOWAIT ? 0: PGO_SYNCIO));
		if (waitfor == MNT_WAIT) {
			drain_output(vp, 0);
		}
		ACQUIRE_LOCK(&lk);

The pgo_put routine is supposed to drop uobj->vmobjlock before it
returns, but it didn't.

I added assertions to flush_inodedep_deps to catch this case (i.e.,
returning from pgo_put with the lock still held), and one triggered
here:

cpu_Debugger(2d,0,2,c6e0bd00,c02f34c4) at cpu_Debugger+0x4
panic(c04a3a60,c7055c54,c03226a0,0,0) at panic+0xa0
flush_inodedep_deps(c09c1000,3c0a,c05dc2e0,0,0) at flush_inodedep_deps+0xf0
softdep_sync_metadata(c6e0bdf4) at softdep_sync_metadata+0x6b
ffs_full_fsync(c6e0bdf4,c7055c54,c6dfa720,5,c6f67e68) at ffs_full_fsync+0x205
ffs_fsync(c6e0bdf4,1b,8,0,c0487e00) at ffs_fsync+0x3a
VOP_FSYNC(c7055c54,ffffffff,5,0,0) at VOP_FSYNC+0x52
vinvalbuf(c7055c54,1,ffffffff,c6dfa720,0,0) at vinvalbuf+0x75
vclean(c7055c54,8,c6dfa720) at vclean+0x8e
vgonel(c7055c54,c6dfa720,c6f6b5fc,c7055c54,c6e0bee8) at vgonel+0x3f
vrecycle(c7055c54,0,c6dfa720,c7055c54,c6dfa720) at vrecycle+0x44
ufs_inactive(c6e0bef4,c04880e0,c7055c54,c6dfa720,c6e0bf24) at ufs_inactive+0xfd
VOP_INACTIVE(c7055c54,c6dfa720,c7055c54,c04869cc,4d3) at VOP_INACTIVE+0x2b
vput(c7055c54,c05dc2e0,c6f740c0,0,0) at vput+0x137
handle_workitem_remove(c6f740c0) at handle_workitem_remove+0x138
softdep_process_worklist(0) at softdep_process_worklist+0x1b1
sched_sync(c6dfa720) at sched_sync+0x163
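
For illustration only, the kind of check described above (did the put
routine return with the lock still held?) can be modeled in user space
roughly as follows. This is a sketch, not the assertion that was
actually added to flush_inodedep_deps: struct model_lock, good_put,
leaky_put and put_leaks_lock are all hypothetical names, and the lock
is a plain flag rather than a real simple lock.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for a kernel simple lock. */
struct model_lock {
	bool held;
};

static void
model_lock_acquire(struct model_lock *l)
{
	/* Taking a lock we already hold is the "locking against myself" case. */
	assert(!l->held);
	l->held = true;
}

static void
model_lock_release(struct model_lock *l)
{
	assert(l->held);
	l->held = false;
}

/* A put routine that honors the contract: it drops the lock. */
static int
good_put(struct model_lock *l)
{
	model_lock_release(l);
	return 0;
}

/* A put routine that breaks the contract: it returns with the lock held. */
static int
leaky_put(struct model_lock *l)
{
	(void)l;
	return EOPNOTSUPP;
}

/*
 * Take the lock, call the put routine, and report whether the lock
 * is still held afterwards (true means the routine leaked it).
 */
static bool
put_leaks_lock(int (*put)(struct model_lock *))
{
	struct model_lock l = { false };

	model_lock_acquire(&l);
	(void)put(&l);
	return l.held;
}
```

In this model, put_leaks_lock(good_put) is false and
put_leaks_lock(leaky_put) is true; the second case is the one the
assertion is meant to catch.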

So, what was this broken vnode?

db{0}> show vnode 0xc7055c54
OBJECT 0xc7055c54: locked=1, pgops=0xc05dc5e4, npages=0, refs=0

VNODE flags 100<XLOCK>
mp 0xc09c8e00 numoutput 0 size 0x0
data 0xc6f6b5fc usecount 0 writecount 0 holdcnt 0 numoutput 0
type VFIFO(7) tag VT_UFS(1) id 0x590 mount 0xc09c8e00 typedata 0x0

It appears that VOP_FSYNC() on a FIFO will blow up, because none of
the *_fifoop_entries op vectors contains a putpages() operation, so
the call winds up in vn_default_error (which returns EOPNOTSUPP and
doesn't unlock the vnode).
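
To make the failure mode concrete, here is a small user-space model of
that dispatch. It is a sketch under stated assumptions, not the kernel
code: struct model_vnode, model_default_error, model_null_putpages and
model_call_putpages are hypothetical names, a NULL pointer stands in
for "no putpages entry in the op vector", and the "drop the lock and
return 0" variant is just one possible direction, not necessarily the
right fix.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a vnode; only the object lock is modeled. */
struct model_vnode {
	bool interlock_held;
};

typedef int (*model_putpages_fn)(struct model_vnode *);

/*
 * Models the default handler used when the op vector has no entry:
 * it reports an error and never touches the lock.
 */
static int
model_default_error(struct model_vnode *vp)
{
	(void)vp;
	return EOPNOTSUPP;
}

/*
 * Hypothetical alternative for vnode types with no pages to flush:
 * do nothing except drop the lock, as the caller expects.
 */
static int
model_null_putpages(struct model_vnode *vp)
{
	vp->interlock_held = false;
	return 0;
}

/*
 * Models the caller: take the lock, then call the put operation,
 * which is expected to release the lock before returning.
 * A NULL op means "not in the vector" and falls through to the default.
 */
static int
model_call_putpages(model_putpages_fn op, struct model_vnode *vp)
{
	if (op == NULL)
		op = model_default_error;

	/* Re-taking a leaked lock is the "locking against myself" case. */
	assert(!vp->interlock_held);
	vp->interlock_held = true;

	return op(vp);
}
```

With a NULL op the call returns EOPNOTSUPP and leaves interlock_held
set, so the next call on the same vnode trips the assertion; with
model_null_putpages the lock is released and the call returns 0.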

Anyone have strong opinions on the right way to fix this?

						- Bill