Subject: Re: kern/32535: processes stuck on vnlock
To: None <kern-bug-people@netbsd.org, gnats-admin@netbsd.org>
From: Manuel Bouyer <bouyer@antioche.eu.org>
List: netbsd-bugs
Date: 10/07/2006 10:15:04
The following reply was made to PR kern/32535; it has been noted by GNATS.
From: Manuel Bouyer <bouyer@antioche.eu.org>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@NetBSD.org, gnats-admin@NetBSD.org,
netbsd-bugs@NetBSD.org
Subject: Re: kern/32535: processes stuck on vnlock
Date: Sat, 7 Oct 2006 12:10:45 +0200
On Tue, Sep 26, 2006 at 01:35:02AM +0000, Bill Studenmund wrote:
> Ok, here is some more analysis on the problem.
>
> The main issue is that vrele() can lock the vnode before calling
> VOP_INACTIVE(). Since we are calling vrele() on the parent and we have
> the child locked, this locking violates the vnode locking hierarchy; you
> can't lock a vnode's parent while holding the vnode's lock.
>
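
To make the deadlock concrete for anyone picking this up, here is a
minimal hand-written sketch of the pattern described above (not code
from the tree; the function name is made up and the vrele() internals
are simplified):

/*
 * Illustration only: the caller holds the child (vp) locked, as
 * lookup() left it, and then drops its reference on the parent (dvp).
 */
void
example_lookup_caller(struct vnode *dvp, struct vnode *vp)
{
	/*
	 * If dvp's usecount drops to zero here, vrele() locks dvp and
	 * calls VOP_INACTIVE() on it, i.e. it takes the parent's lock
	 * while the child's lock is held, the reverse of the
	 * parent-then-child order used during lookup, so it can
	 * deadlock against another thread doing a lookup in dvp.
	 */
	vrele(dvp);

	/* ... use vp ... */
	vput(vp);
}
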
> I see three ways to fix this:
>
> 1) Do something along the lines of what I think Chuck was talking about,
> and create a work queue to handle destroying vnodes. The trick, though, is
> that VOP_INACTIVE() isn't necessarily about destroying a vnode, it's
> telling the file system that the vnode is going on the free list. The main
> user of this information (AFAIK) is NFS, which will zap a node if the
> silly-rename code has been triggered.
>
> We _could_ have a special worker thread handle the VOP_INACTIVE()
> calls; however, this will happen every time a vnode gets put on the free
> list! Even if we added a flag so that we only did this processing if
> requested (i.e. some vnodes skipped calling VOP_INACTIVE()), we still have
> this weird case where we have the free list we have now, and we have a
> "freeing" list.
>
> My main concern is a case where a file system uses VOP_INACTIVE as an
> indication that it can release resources. I expect such a file system will
> want VOP_INACTIVE calls, and this change will result in a performance hit.
>
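
For what it's worth, a rough sketch of the shape option 1 could take;
the names inactive_defer/inactive_worker are made up, the
VOP_INACTIVE() arguments are simplified, and all the locking needed to
serialize the enqueue side against the worker is omitted:

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/proc.h>
#include <sys/queue.h>
#include <sys/malloc.h>
#include <sys/vnode.h>

/* One entry per vnode whose VOP_INACTIVE() call has been deferred. */
struct inactive_work {
	TAILQ_ENTRY(inactive_work) iw_list;
	struct vnode *iw_vp;
};
TAILQ_HEAD(, inactive_work) inactive_queue =
    TAILQ_HEAD_INITIALIZER(inactive_queue);

/*
 * Called from the vrele() path instead of locking vp and calling
 * VOP_INACTIVE() directly, so no vnode lock is taken while the caller
 * may still be holding a child's lock.
 */
void
inactive_defer(struct vnode *vp)
{
	struct inactive_work *iw;

	iw = malloc(sizeof(*iw), M_TEMP, M_WAITOK);
	iw->iw_vp = vp;
	TAILQ_INSERT_TAIL(&inactive_queue, iw, iw_list);
	wakeup(&inactive_queue);
}

/*
 * Worker thread body: drains the queue while holding no other vnode
 * locks, so the parent-after-child ordering problem cannot show up
 * here.
 */
void
inactive_worker(void *arg)
{
	struct inactive_work *iw;

	for (;;) {
		while ((iw = TAILQ_FIRST(&inactive_queue)) != NULL) {
			TAILQ_REMOVE(&inactive_queue, iw, iw_list);
			vn_lock(iw->iw_vp, LK_EXCLUSIVE | LK_RETRY);
			/* args simplified; VOP_INACTIVE() unlocks the vnode */
			VOP_INACTIVE(iw->iw_vp);
			free(iw, M_TEMP);
		}
		tsleep(&inactive_queue, PVFS, "inactq", 0);
	}
}

The cost described above is visible here: unless it is filtered by a
flag, every vnode headed for the free list takes a trip through this
queue and a context switch to the worker.
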
> 2) We could re-work lookup() so that we don't release the directory lock
> if we're locking the child and we instead call vput(). I think this is the
> best option as it gets rid of the real problem. I'll look into it, but
> if someone else wants to look into this, please do! I'm not sure how
> quickly I can look at it.
>
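
Purely to illustrate what option 2 changes for the code running after a
lookup (function names made up, everything else elided), the difference
is where the parent's lock is given up:

/*
 * Today (roughly): lookup() has already unlocked dvp, so only the
 * reference is left to drop, and vrele() may relock dvp while the
 * child vp is still locked, which is the wrong order.
 */
void
finish_lookup_today(struct vnode *dvp, struct vnode *vp)
{
	vrele(dvp);
	/* ... use vp ... */
	vput(vp);
}

/*
 * Option 2: dvp is still locked, and that lock was taken before vp's.
 * vput() drops the lock and the reference together, so any
 * VOP_INACTIVE() runs under a lock that was acquired in the correct
 * parent-then-child order; no new acquisition happens with vp held.
 */
void
finish_lookup_option2(struct vnode *dvp, struct vnode *vp)
{
	vput(dvp);
	/* ... use vp ... */
	vput(vp);
}

Whether every caller can sensibly keep both vnodes locked up to that
point is presumably the part that needs auditing.
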
> 3) We add a vrele2() call that takes the vnode on which we're calling
> vrele() and another vnode that we have locked. It would process just like
> vrele(), except instead of just locking the vnode, it tries to get the
> lock. If it can, it proceeds. If it can't, it releases the other vnode
> then blocks waiting for the first vnode's lock.
>
> Option 3 probably is the easiest solution. But I'm not sure what I think
> of it.
>
> Actually, if we did this as a locking primitive (vlock_with_held()), we
> could use it for ".." lookup too. It's the same issue as lookup on ".."...
> :-)
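
And a sketch of that primitive as I read the description; the
LK_NOWAIT try-lock and the return convention (non-zero meaning
"heldvp's lock was dropped, relock and revalidate it") are my
assumptions, signatures are simplified, and error handling such as the
vnode being reclaimed while we sleep is ignored:

/*
 * Lock vp while heldvp is already locked, without blocking in the
 * wrong order.  In the vrele() case vp is the parent and heldvp the
 * child; the ".." lookup has the same shape.
 */
int
vlock_with_held(struct vnode *vp, struct vnode *heldvp)
{
	/* Try to take vp's lock without sleeping.  If that works, we
	 * never blocked while holding heldvp's lock. */
	if (vn_lock(vp, LK_EXCLUSIVE | LK_NOWAIT) == 0)
		return 0;

	/* Contended: release the held lock first, then block waiting
	 * for vp's lock the way a plain vn_lock() would. */
	VOP_UNLOCK(heldvp, 0);
	vn_lock(vp, LK_EXCLUSIVE | LK_RETRY);

	/* Tell the caller heldvp was dropped: it must relock it (now
	 * safe, since vp-then-heldvp is parent-then-child in the
	 * vrele() case) and revalidate whatever it knew about it. */
	return 1;
}

The window where the held vnode is unlocked is the awkward part, and
maybe why option 3 is unappealing: every caller has to cope with the
vnode having changed underneath it.
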
Note that if someone writes a fix for this I'll be happy to try it.
I have a box which has to be rebooted -n -q periodically because of
processes stuck on vnlock. One day I'll change the disk system to get
rid of the null mounts, which will fix the issue :)
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 years of experience will always make the difference
--