Subject: Re: vnode locking in coda_lookup
To: Greg Troxel <gdt@ir.bbn.com>
From: Bill Stouder-Studenmund <wrstuden@netbsd.org>
List: tech-kern
Date: 04/05/2007 14:24:27
On Thu, Apr 05, 2007 at 01:27:12PM -0400, Greg Troxel wrote:
> > Thanks - I do follow that we need to have an order to avoid deadlocks,
> > and what our order is. What's not 100% clear in my head is the exact
> > relationship between the interlock and the vnode lock, which has to be
> > held for each possible operation, and mostly what can happen after vref
> > but without holding the vnode lock.
>
> Interlock is fine. You don't take other locks while holding the interlock.
> You at most trade the interlock on a vnode for the lock on said vnode. And
> the locking code handles this correctly.
>
> So is this right?
>
> interlock protects
> access to vnode lock
No. The idea is that you CAN pass the interlock into the locking routine
and that if you get the lock, no one will have changed things covered by
the interlock before you get the lock.

Well, that's the idea. Life isn't really like that. The problem is that if
someone else has the lock, we have to release the interlock to sleep.

What will happen 100% of the time is that we will get the interlock on the
lock (this is a detail of lockmgr, which will change when we move to
mutexes) before we release the passed-in interlock.
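
Roughly, the hand-off looks like this (a sketch on my part, using the
lockmgr-era interfaces; the wrapper function is illustrative, not real
code from coda):

#include <sys/param.h>
#include <sys/lock.h>
#include <sys/vnode.h>

static int
example_trade_interlock(struct vnode *vp)
{
	simple_lock(&vp->v_interlock);
	/* examine state covered by the interlock: v_usecount, flags */

	/*
	 * LK_INTERLOCK says we hold v_interlock and vn_lock should
	 * release it for us.  Per the lockmgr detail above, the lock's
	 * own internal interlock is taken before v_interlock is
	 * dropped, so nothing covered by v_interlock changes unseen.
	 */
	return vn_lock(vp, LK_EXCLUSIVE | LK_INTERLOCK);
}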
> reference counts
Yes, reference counts and flags.
> vnode lock protects
> all other changes to vnode contents
Yep.
> So, when holding vnode locks, one can be assured of no changes, and do
> multi-step operations which might temporarily violate invariants.
Yes.
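
To make that concrete, a sketch (same headers as the sketch above; the
fields and steps are hypothetical, the locking calls are the standard
ones):

static int
example_multistep_update(struct vnode *vp)
{
	int error;

	error = vn_lock(vp, LK_EXCLUSIVE);
	if (error != 0)
		return error;
	/* step 1: e.g. bump the size field */
	/* step 2: e.g. attach the new block mappings */
	/* invariants hold again only after both steps; no one */
	/* holding the lock out can observe the state in between */
	VOP_UNLOCK(vp, 0);
	return 0;
}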
> When holding a reference, one can be sure that the vnode will continue
> to exist, but not that it won't be revoked.
Yes. You will continue to have a struct vnode on the other end that will
not be something different than what you started with. It may die and
start spurting errors, but it won't, say, all of a sudden become an NFS
vnode.
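
In code terms (my illustration; same headers as above):

static void
example_hold_vnode(struct vnode *vp)
{
	VREF(vp);	/* vp cannot be recycled out from under us */
	/*
	 * If the file is revoked meanwhile, vp's op vector is switched
	 * to the dead filesystem: still the same struct vnode, now
	 * returning errors, but never silently a different (e.g. NFS)
	 * vnode.
	 */
	vrele(vp);	/* drop the reference when done */
}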
> Reading vfs_lookup, I see LK_RETRY all over the place, along with
> failure to check vn_lock return values. Not saying that argues for it
> being correct in coda_lookup, but it seems there's a lot of LK_RETRY
> trouble about.
Yeah, we're really sloppy about it.
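
For contrast, the careful idiom that vfs_lookup often skips looks like
this (a sketch; flags and error handling per the lockmgr-era vn_lock(9),
same headers as above):

static int
example_lock_checked(struct vnode *vp)
{
	int error;

	error = vn_lock(vp, LK_EXCLUSIVE);	/* no LK_RETRY */
	if (error != 0)
		return error;	/* vp was revoked while we slept */
	/* ... vp is locked and still alive here ... */
	VOP_UNLOCK(vp, 0);
	return 0;
}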
> > In the case of relocking the parent, we also have a vref (or rather our
> > caller must), so can that ever reasonably fail? Should I panic if
> > vn_lock (w/o LK_RETRY) fails?
>
> Don't panic. Return an error. This case (".." lookup) is why I added
> PDIRUNLOCK. I am not sure if chuq removed it.
>
> PDIRUNLOCK appears to have been removed in a commit
> kern/vfs_lookup.c:1.73 by chs on 2006/12/09.
>=20
> I don't understand how coda_lookup can handle not being able to relock
> the parent (without panicing, which I realize isn't exactly "handle").
I think the idea now is that you should use LK_RETRY, and you'll end up
locking something. It may not be the vnode you want, but it will get
locked. :-|
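
For the ".." relock in coda_lookup, that would look something like this
(my sketch; the function name is illustrative, same headers as above):

static void
example_relock_parent(struct vnode *pdvp)
{
	/*
	 * With LK_RETRY the call always comes back with the vnode
	 * locked, even if it was revoked while we slept, so there is
	 * no error to handle here; subsequent VOPs on a dead vnode
	 * will fail instead.
	 */
	vn_lock(pdvp, LK_EXCLUSIVE | LK_RETRY);
}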
Take care,
Bill