Subject: Re: pr/35143 and layer_node_find()
To: Chuck Silvers <firstname.lastname@example.org>
From: Bill Studenmund <email@example.com>
Date: 11/30/2006 10:10:22
On Thu, Nov 30, 2006 at 08:27:30AM -0800, Chuck Silvers wrote:
> On Wed, Nov 29, 2006 at 11:20:19AM -0800, Bill Studenmund wrote:
> > But we can't return a valid vp in this case. To get into this corner case,
> > we have to have another thread in the kernel actively in the process of
> > cleaning said vnode. To be honest, I'm not 100% sure how a big-lock kernel
> > got into this case, but it did...
> ... and now I realize that my proposal doesn't actually fix the problem
> in this PR. as you say, we don't understand how it could have happened yet.
> I think we need to figure that out before we decide on a fix.
Ok, here's an idea on how it can happen on our current kernel.
We have one process holding the lock on the lower vnode, upper vnode is
Then another process comes into the layered file system, does a lookup on
vnode, and blocks in VOP_LOOKUP() on the lower layer waiting for the lock.
Then a third process comes in and decides to recycle a vnode. It gets the
layer vnode, sets VXLOCK, then goes to sleep waiting to get the stack's
First process finishes doing whatever, and releases the lock on the stack.
Both the second and third processes are marked runnable.
Second process gets the lock and proceeds to get the vnode above the lower
node, the same vnode the third process wants to recycle. vget() blocks as
it sees the VXLOCK flag set.
We are now deadlocked.
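
To make the cycle explicit, here is a small user-space sketch (not kernel
code) of the wait-for relationship described above. The names NO_WAIT and
waits_in_cycle() are made up for this illustration. Process 2 holds the
lower lock and waits for VXLOCK to clear; process 3 has set VXLOCK and waits
for the lower lock.

```c
#include <stdbool.h>

#define NO_WAIT (-1)	/* hypothetical marker: this process is not blocked */

/*
 * waits_for[i] is the index of the process that currently owns whatever
 * process i is sleeping on, or NO_WAIT if process i is runnable.
 * Returns true if following the chain from 'start' leads back to 'start'.
 */
bool
waits_in_cycle(const int waits_for[], int nproc, int start)
{
	int cur = start;

	for (int steps = 0; steps < nproc; steps++) {
		cur = waits_for[cur];
		if (cur == NO_WAIT)
			return false;	/* chain ends at a runnable process */
		if (cur == start)
			return true;	/* came back around: deadlock */
	}
	return false;
}
```

With index 0 as the first process (done, not waiting), index 1 as the second
(waits on the third) and index 2 as the third (waits on the second), the
table is { NO_WAIT, 2, 1 } and the check reports a cycle for both 1 and 2.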
We have to have the vget() not wait if it sees VXLOCK.
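
As a rough sketch of what "not wait" means here, the following user-space
model shows the two behaviours side by side. It is not the real vget(); the
struct, the MODEL_* names and the return convention are assumptions made
only for illustration.

```c
#include <errno.h>

#define MODEL_VXLOCK	0x1	/* stand-in for the VXLOCK vnode flag */
#define MODEL_NOWAIT	0x1	/* hypothetical "do not sleep" request */
#define MODEL_WOULD_SLEEP (-1)	/* model only: the caller would block here */

struct model_vnode {
	int flags;
	int usecount;
};

/*
 * Returns 0 and takes a reference if the node is not being reclaimed.
 * If it is being reclaimed, either fail at once with EBUSY (no-wait case)
 * or report that the caller would have gone to sleep.
 */
int
model_vget(struct model_vnode *vp, int flags)
{
	if (vp->flags & MODEL_VXLOCK) {
		if (flags & MODEL_NOWAIT)
			return EBUSY;		/* caller keeps running */
		return MODEL_WOULD_SLEEP;	/* this is where we deadlock */
	}
	vp->usecount++;
	return 0;
}
```

The point of the no-wait path is that the caller still holds the lower lock
when it gets EBUSY back, so it can decide what to do rather than sleeping
with that lock held.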
I still don't see what's wrong with letting the being-destroyed nodes stay
in the hash table. For them to have VXLOCK set, there has to be a thread
reclaiming them, so they will be removed from the hash list in due time.
The only other alternative I can see is to have vget() detect VXLOCK,
unlock the lower node, relock the lower node, and try again. That would
give the reclaim time to remove the dying upper node.
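
A minimal user-space sketch of that unlock/relock/retry idea follows. It is
only a model: the structs, the RETRY_VXLOCK flag, retry_reclaimer_runs() and
retry_find_upper() are invented for illustration, and the reclaimer is
simulated as running whenever the lower lock is free.

```c
#include <errno.h>

#define RETRY_VXLOCK	0x1	/* stand-in for VXLOCK on the upper node */

struct retry_lower { int locked; };
struct retry_upper { int flags; int in_hash; int usecount; };

/* Model of the reclaiming thread: it can only finish once the lower
 * lock is free, at which point it pulls the upper node off the hash. */
static void
retry_reclaimer_runs(struct retry_lower *lo, struct retry_upper *up)
{
	if (!lo->locked && (up->flags & RETRY_VXLOCK)) {
		up->in_hash = 0;
		up->flags &= ~RETRY_VXLOCK;
	}
}

/*
 * Called with the lower node locked. Returns 0 with a reference on the
 * upper node, ENOENT if the upper node has left the hash (the caller
 * would then build a fresh one), or EBUSY after max_tries attempts.
 */
int
retry_find_upper(struct retry_lower *lo, struct retry_upper *up, int max_tries)
{
	for (int i = 0; i < max_tries; i++) {
		if (!up->in_hash)
			return ENOENT;
		if (!(up->flags & RETRY_VXLOCK)) {
			up->usecount++;
			return 0;
		}
		lo->locked = 0;			/* let the reclaim proceed */
		retry_reclaimer_runs(lo, up);
		lo->locked = 1;			/* relock and look again */
	}
	return EBUSY;
}
```

In this model the lower lock is always held again on return, and a dying
upper node is never handed back to the caller.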