Subject: Re: LK_RETRY and coda_lookup
To: Greg Troxel <>
From: Bill Studenmund <>
List: tech-kern
Date: 03/15/2006 15:07:47
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Wed, Mar 15, 2006 at 04:44:49PM -0500, Greg Troxel wrote:
> Bill Studenmund <> writes:
> > On Tue, Feb 28, 2006 at 12:38:30PM -0500, Greg Troxel wrote:
> > > In src/sys/coda/coda_vnops.c 1.46, I added LK_RETRY to vn_lock calls.
> > > This seems correct because vn_lock will (absent LK_RETRY) fail if the
> > > vnode is already locked (presumably by another thread), but it will
> > > first sleep until a wakeup happens.
> >=20
> > Sorry for the delay, but I think tis is wrong.
> no problem on delay (so leaving extra quoting).
> > > 1) Am I correct that some other process could have VXLOCK on the
> > >    vnode, that this is really a normal condition, and thus that
> > >    LK_RETRY is really in order?  (I would think LK_RETRY should only
> > >    be omitted as part of some deadlock avoidance procedure where the
> > >    calling code is doing retries, or something like that.)
> >=20
> > VXLOCK should only be used when we are recycling the vnode - when we ar=
> > blowing it away. It is set when we are changing the type of the vnode, =
> > part as all of the locking originally was done by the fs.
> So vn_lock sleeps if VXLOCK is set (which should be rare).  If VXLOCK
> is not set, it just calls VOP_LOCK.  coda_lock just does
>     return (lockmgr(&vp->v_lock, ap->a_flags, &vp->v_interlock));

That is fine.

> In lots of code, I see LK_RETRY (e.g. egrep LK_RETRY sys/vfs_vnops.c).
> So I don't understand when that flag should and should not be used -
> and why if it's wrong in coda vnops it's ok in cfs_vnops.c.
> It seems VOP_LOCK is supposed to sleep until succesful rather than
> returning EBUSY, and coda_lock uses lockmgr, so that should be ok.

I think the kernel is rather slopy with respect to LK_RETRY (or there's=20
something I don't understand :-) . I think it all stems from not wanting=20
to bother to catch errors. So you "retry", get the deadfs vnode, then let=
it error out your operation.

> I wonder if something in coda causes VXLOCK that shouldn't.
> If I shouldn't use LK_RETRY, should a vnop that fails to get a lock
> return EIO or some such?  It seems like it should not be a panic.  Or
> should it, and failing to get a lock on a vnode on which we *should*
> have a ref is a sign of incorrect behavior.

I think it should error out. I'm not certain onthe error code, but EIO=20
isn't bad.

> > > 2) Why does vn_lock sleep if it's going to fail?  Is this for
> > >    caller-managed retry loops?
> >=20
> > To wait for the cleanup to finish, I expect.
> thanks, makes sense.

Also, with the current, "Let deadfs error out," by waiting, we make sure=20
deadfs is ready to handle the operations.

> > > 3) For clarity, it seems that LK_INTERLOCK should be cleared after
> > >    ltsleep, and separately after VOP_LOCK.  I realize this is a case
> > >    of CSE but it makes it harder to realize that the rules are being
> > >    followed.
> >=20
> > If I understand the code right, I think it's a style thing. We go throu=
> > one or the other side of an if statement, and we clear it after the if.
> I think it's a style thing - I was making the point that the current
> style makes it hard to understand how the rules are being followed.
> It sounds like you think the code is right, and I'm not trying to
> change it, but I appreciate the feedback since I am gradually coming
> to understand vfs locking.

I don't know if KNF has a recomendation. If it does, let's go with=20
whatever it is. :-)

> With my recent change to coda_venus (1.20) and LK_RETRY, I can run 20
> of pax tarring up 350MB of data in coda to /dev/null in parallel.
> getcwd is failing within /coda/realm.  The userland/venus fid to inode
> mapping is out of sync with the kernel, and that's on my list to fix,
> and may resolve the getcwd.
> As alwyas, thanks for the locking advice.

NP. I'd really like us to have things like coda and afs working well. :-)

Take care,


Content-Type: application/pgp-signature
Content-Disposition: inline

Version: GnuPG v1.2.3 (NetBSD)