Subject: Re: vn_lock(LK_RETRY) (was: Re: CVS commit: src/sys/miscfs)
To: YAMAMOTO Takashi <yamt@mwd.biglobe.ne.jp>
From: Bill Studenmund <wrstuden@netbsd.org>
List: tech-kern
Date: 06/18/2004 19:57:25

On Fri, Jun 18, 2004 at 10:23:07AM +0900, YAMAMOTO Takashi wrote:
> [moved from source-changes@]
>
> > On Thu, Jun 17, 2004 at 11:02:07AM +0900, YAMAMOTO Takashi wrote:
> > point was to handle errors in looking up '..'. If there's an error getting
> > '..' and we can't then re-lock '.', this flag tells callers something
> > happened.
>
> lockmgr() will panic in that case.  thus no errors are returned.
> usually, VOP_LOCK won't be called because VXLOCK is set in that case.
> layered filesystems are broken in that respect, though.

Actually, lockmgr() will only panic if DIAGNOSTIC is defined. I'm not sure
I really like what it does otherwise, which is to just carry on. I think
I'd rather it returned some sort of error. But I'm not willing to fight
for that today.
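
To make the alternative concrete, here is a minimal user-space sketch. It
is not the real lockmgr() code: struct toy_lock, toy_lock_acquire() and
the choice of EBUSY are all hypothetical, picked only for illustration.
With DIAGNOSTIC defined it aborts (standing in for panic()); otherwise it
returns an error instead of carrying on.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a kernel lock; not the real struct lock. */
struct toy_lock {
	int dead;	/* nonzero once the lock may no longer be taken */
	int held;	/* nonzero while someone holds the lock */
};

/*
 * Try to take the lock.  Returns 0 on success.  On a dead lock it
 * aborts when DIAGNOSTIC is defined (standing in for panic()) and
 * returns EBUSY otherwise, rather than silently continuing.
 */
static int
toy_lock_acquire(struct toy_lock *lk)
{
	assert(lk != NULL);
	if (lk->dead) {
#ifdef DIAGNOSTIC
		fprintf(stderr, "toy_lock_acquire: dead lock\n");
		abort();
#else
		return EBUSY;	/* assumed error code, for illustration */
#endif
	}
	lk->held = 1;
	return 0;
}
```

A caller is still free to ignore the return value, but the failure is no
longer invisible.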

> > > > Yes, for it to happen we have to have
> > > > wandered into the weeds, but this code methodology is to try and help us
> > > > get somewhere a bit safer. i.e. not add more error states on top if we
> > > > don't have to.
> > >
> > > we have many code which assume that vn_lock with LK_RETRY never fails.
> > > are you going to add error checks on all of them?
> > > i think that it just bloats the code without any benefits.
> >
> > The difference here is that you're changing the state of PDIRUNLOCK. Don't
> > clear it if you don't know the lock succeeded.
>
> actually, there's no much differences.
> if vn_lock(LK_RETRY) can fail,
> all callers should check it and shouldn't unlock the vnode after an error.
> it's better to make sure that vn_lock(LK_RETRY) won't fail rather than
> letting all callers check error conditions, IMO.

Well, then I guess we'll have to disagree. I'm always concerned about what
happens if a routine that "can't fail" realizes it is going to fail. It
either has to do something like panic, or it has to pretend things are ok.
If instead it can fail, it can tell the upper layers that something is
wrong. They can choose not to do anything, but they've been told.

Also, consider something like a distributed file system with a lock
manager (as in a separate server). Say you can't contact the lock manager.
Do you really sit there in uninterruptible sleep forever? Yes, how and
when you decide to give up is something the admin decides (as it should
be). But the scenario shows a reason to have vn_lock(LK_RETRY) fail.

Take care,

Bill
