netbsd-bugs: Re: kern/32535: processes stuck on vnlock

Subject: Re: kern/32535: processes stuck on vnlock
To: None <gnats-bugs@NetBSD.org>
From: Manuel Bouyer <bouyer@antioche.eu.org>
List: netbsd-bugs
Date: 10/07/2006 12:10:45
On Tue, Sep 26, 2006 at 01:35:02AM +0000, Bill Studenmund wrote:
>  Ok, here is some more analysis on the problem.
>  
>  The main issue is that vrele() can lock the vnode before calling=20
>  VOP_INACTIVE(). Since we are calling vrele() on the parrent and we have=20
>  hte child locked, this locking violates the vnode locking hierarchy; you=20
>  can't lock a vnode's parent while holding the vnode's lock.
>  
>  I see three ways to fix this:
>  
>  1) Do something along the lines of what I think Chuck was talking about,=20
>  and create a work queue to handle destroying vnodes. The trick, though, is=
>  =20
>  that VOP_INACTIVE() isn't necessarily about destroying a vnode, it's=20
>  telling the file system that the vnode is going on the free list. The main=
>  =20
>  user of this information (AFAIK) is NFS, which will zap a node if the=20
>  silly-rename code has been triggered.
>  
>  We _could_ have a special worker thread do handle the VOP_INACTIVE
>  calling, however this will happen every time a vnode gets put on the free
>  list! Even if we added a flag so that we only did this processing if
>  requested (i.e. some vnodes skipped calling VOP_INACTIVE()), we still have
>  this weird case where we have the free list we have now, and we have a
>  "freeing" list.
>  
>  My main concern is a case where a file system uses VOP_INACTIVE as an=20
>  indication that it can release resources. I expect such a file system will=
>  =20
>  want VOP_INACTIVE calls, and this change will result in a performance hit.
>  
>  2) We could re-work lookup() so that we don't release the directory lock=20
>  if we're locking the child and we instead call vput(). I think this is the=
>  =20
>  best option as it gets rid of the real problem. I'll look into it, but=20
>  if someone else wants to look into this, please do! I'm not sure how=20
>  quickly I can look at it.
>  
>  3) We add a vrele2() call that takes the vnode on which we're calling=20
>  vrele() and another vnode that we have locked. It would process just like=
>  =20
>  vrele(), except instead of just locking the vnode, it tries to get the=20
>  lock. If it can, it proceeds. If it can't, it releases the other vnode=20
>  then blocks waiting for the first vnode's lock.
>  
>  Option 3 probably is the easiest solution. But I'm not sure what I think=20
>  of it.
>  
>  Actually, if we did this as a locking primitive (vlock_with_held()), we=20
>  could use it for ".." lookup too. It's the same issue as lookup on ".."...=
>  =20
>  :-)

Note that if someone writes a fix for this I'll be happy to try it.
I have a box which has to be rebooted -n -q periodically because of
processes stuck on vnlock. One day I'll change the disk system to get
rid of the null mounts, which will fix the issue :)

-- 
Manuel Bouyer <bouyer@antioche.eu.org>
     NetBSD: 26 ans d'experience feront toujours la difference
--