Subject: Re: nfsd: locking botch in op %d
To: None <>
From: der Mouse <mouse@Rodents.Montreal.QC.CA>
List: tech-kern
Date: 03/13/2001 14:07:10
>>> In the case of an aliased device node, ufs_vinit calls vput() on
>>> the old vnode just before initializing the new one.
>> [...] ufs_vinit, before it vput()s the old vnode, bashes the
>> vnodeops field to specfs's vnodeops.  And specfs's unlock routine is
>> genfs_nounlock, which doesn't actually do anything.  This means that
>> the VOP_UNLOCK in vput() is a no-op.
> Oops!


Actually, I can't help wondering how this *ever* worked.  There was a
time when that diskless machine worked fine with my NFS server, and I
haven't changed anything since then that I can see affecting this.

> Could you try changing the genfs_no{,is,un}lock{,ed} calls into the
> real-lock varieties and see what happens?

I didn't try that, since I don't know what else uses those routines,
and some uses of them may depend on their semantics.  (If you really
want me to, I can try that, but it wouldn't surprise me if it broke
something else, something that depends on the genfs_no* behavior being
what the name implies.)

What I did do, and it seems to have made the problem go away, is this:

--- /sources/latest-usr-src/sys/ufs/ufs/ufs_vnops.c	Tue Mar  7 18:19:42 2000
+++ ufs_vnops.c	Tue Mar 13 01:00:13 2001
@@ -1917,6 +1917,9 @@
 			nvp->v_data = vp->v_data;
 			vp->v_data = NULL;
+			/* Once v_op is bashed, vput's VOP_UNLOCK is a no-op.
+			   vp is still locked here, so unlock it first. */
+			VOP_UNLOCK(vp, 0);
 			vp->v_op = spec_vnodeop_p;

I can't pretend to believe that this is the right fix.  But it made my
symptoms go away; if it does likewise for the other person who had the
problem, I'd be inclined to say it is the locking problem I outlined.
What the proper fix is, that question I'll leave up to those who
actually understand the code.

					der Mouse

		     7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B