Subject: Re: pr/35143 and layer_node_find()
To: Chuck Silvers <>
From: Bill Studenmund <>
List: tech-kern
Date: 12/06/2006 21:59:32
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Wed, Dec 06, 2006 at 09:01:33PM -0800, Chuck Silvers wrote:
> On Sun, Dec 03, 2006 at 01:06:59PM -0800, Bill Studenmund wrote:
> > > wouldn't getcleanvnode() skip over this locked VLAYER vnode?
> >=20
> > Hmm... It's supposed to.
> >=20
> > Please see Darrin's analysis in the PR. If you can see something that's=
> > not right in it, please comment!
> I've figured out what's going wrong with the VOP_ISLOCKED() call.
> lockstatus() can return 0 incorrectly if the lk_flags contains LK_WANT_EX=
> or LK_WANT_UPGRADE.  so if we have this sequence of events then we'll
> hit the deadlock:
> thread 1 locks the lower vnode with LK_SHARED and sleeps.
> thread 2 sets LK_WANT_EXCL on the vnode lock and sleeps.
> thread 1 wakes up and releases the vnode lock.
> thread 3 tries to recycle the upper vp, set XLOCK but sleeps on VOP_LOCK()
>          because LK_WANT_EXCL is set
> thread 2 wakes up, clears LK_WANT_EXCL, finishes acquiring the lock,
>          and then calls vget() on the upper vnode and sleeps on XLOCK.
> now threads 2 and 3 are deadlocked.
> the attached patch fixes this problem, and with both this patch and
> my other patch for the vrele() issue applied, a stress test survives
> much longer than it previously took to trigger a hang.

What eventually kills it?

> any objection to going with this approach?

We DEFINITELY need this patch. How exactly did you figure this out? :-)

I still am uncomfortable with us calling vget() without LK_NOWAIT as it=20
seems like a deadlock waiting to happen, protected by the race in the=20
getnewvnode() test. Also, if we had a better way to recover from the=20
reaping case, we wouldn't need the test in getnewvnode(). :-)

However I agree totally with your patch on everything except this one=20
point, so it's probably better to get this into 4.0 and keep thinking than=
to not.

You earlier sounded like you didn't like my thought (due to the=20
multi-instance aspect) but that you didn't object to it. Would you mind me=
making that change after you check all this in (and after I get your=20
stress tool & run it)?

Take care,


Content-Type: application/pgp-signature
Content-Disposition: inline

Version: GnuPG v1.4.3 (NetBSD)