Subject: Re: Enhancing the NetBSD kernel
To: None <hpeyerl@beer.org, tech-kern@netbsd.org>
From: Ross Harvey <ross@teraflop.com>
List: tech-kern
Date: 11/13/1998 15:45:58
> From: Herb Peyerl <hpeyerl@beer.org>
>
> maximum entropy <entropy@zippy.bernstein.com>  wrote:
>  > Would it be particularly difficult to implement the actual locking?
>  > I'm just wondering why it hasn't been done...is there a technical
>  > reason, or is it just because no one has volunteered to write the code?
>
> I thought somewhere along the way, someone (Henry Spencer?) had 
> proven logically or algorithmically that it was impossible to 
> implement correctly.


It's irrelevant whether absolute mathematical assurances are possible.
Also, both monitored and non-monitored lock specifications exist...which
was he talking about? In an engineering discipline, what matters is whether
the odds of failure are lower than those of the problems we already live
with: disk errors, power outages (even if you have a UPS, those batteries
don't run forever), software bugs...etc.  The exact mode of failure (what
if it is retryable?) can also matter much more than the odds alone.

Experience with commercial Unix shows that NFS file locking can in fact be
made to work. Sure, the folklore and conventional wisdom are highly biased
against it, partly because it was so problematic in the early, pre-POSIX,
pre-X/Open CAE XNFS-spec days...like when Ultrix and Sun did it differently.

I am holding the X/Open XNFS V3 spec in my hand, which cost a bit of money
a year ago, though much of it is on-line today. It specifies a procedure
by which NFS file locking can work despite server or client crashes. Various
tricks are used for crash recovery, e.g., there is a post-reboot grace period
during which clients may reestablish old locks but not make new locks.
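
To make the grace-period idea concrete, here is a minimal sketch in Python.
It is not the XNFS procedure itself, only a toy model of the one rule
described above: during the grace period, reclaims of old locks are allowed
and new locks are refused. The names (LockServer, DENIED_GRACE, the reclaim
flag, the 45-second figure in the usage note) are hypothetical and chosen
for illustration, and treating a late reclaim as an ordinary request is an
assumption of the sketch.

```python
import time
from dataclasses import dataclass, field

# Hypothetical status values, for illustration only.
GRANTED = "granted"
DENIED = "denied"
DENIED_GRACE = "denied_grace_period"


@dataclass
class LockServer:
    """Toy lock server that has just rebooted and lost its lock table."""
    grace_seconds: float
    boot_time: float = field(default_factory=time.monotonic)
    locks: dict = field(default_factory=dict)  # path -> owner

    def in_grace(self, now=None):
        now = time.monotonic() if now is None else now
        return now - self.boot_time < self.grace_seconds

    def lock(self, path, owner, reclaim=False, now=None):
        # During the grace period only reclaims of pre-crash locks are
        # accepted; brand-new requests are told to retry later.
        if self.in_grace(now) and not reclaim:
            return DENIED_GRACE
        # Assumption: after the grace period a reclaim is treated like
        # any ordinary request and competes for the lock normally.
        holder = self.locks.get(path)
        if holder is not None and holder != owner:
            return DENIED
        self.locks[path] = owner
        return GRANTED
```

For example, with srv = LockServer(grace_seconds=45.0, boot_time=0.0), a
call srv.lock("/export/f", "clientA", now=10.0) is refused with
DENIED_GRACE, while the same call with reclaim=True is granted. A real
server would also have to verify that the reclaimed lock really was held
before the crash, which this sketch does not attempt.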

It's a pity we don't have it, as it is just about the only missing piece
--but a big one-- w.r.t. the use of NetBSD as a file server.

There are several reasons it hasn't been done:

	* it's an icky subject that is disparaged by folklore

	* it has tricky kernel-user issues...you need to feed lock state
	  out of the kernel, and it isn't the NetBSD way to hack up a
	  kludge interface, so it's kind of waiting on an officially-
	  sanctioned NetBSD event interface. (My inclination would be to
	  make a `lock socket' and hack it. :-)

	* POSIX file locking is ... special ...
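
As one illustration of what `special' can mean for POSIX locking (an
assumed example, not necessarily the quirk meant above): fcntl()-style
record locks belong to the process, and closing any descriptor for a file
drops every lock that process holds on the file. The sketch below, in
Python on a POSIX system, takes a lock through one descriptor, optionally
closes a second descriptor for the same file, and forks a child to see
whether the lock is still there. The function name is hypothetical.

```python
import fcntl
import os
import tempfile


def parent_lock_survives(close_second_fd: bool) -> bool:
    """Hypothetical helper: lock a file, optionally close another
    descriptor for it, and report whether the lock is still held."""
    with tempfile.NamedTemporaryFile() as tmp:
        fd1 = os.open(tmp.name, os.O_RDWR)
        fcntl.lockf(fd1, fcntl.LOCK_EX)       # POSIX (fcntl-style) lock

        fd2 = os.open(tmp.name, os.O_RDONLY)  # second, unrelated descriptor
        if close_second_fd:
            os.close(fd2)  # POSIX: releases all our locks on this file

        pid = os.fork()
        if pid == 0:
            # Child is a separate process and does not inherit the lock,
            # so its request conflicts with the parent's lock if it exists.
            cfd = os.open(tmp.name, os.O_RDWR)
            try:
                fcntl.lockf(cfd, fcntl.LOCK_EX | fcntl.LOCK_NB)
                os._exit(0)  # acquired: the parent's lock is gone
            except OSError:
                os._exit(1)  # refused: the parent's lock is still held
        _, status = os.waitpid(pid, 0)

        if not close_second_fd:
            os.close(fd2)
        os.close(fd1)
        return os.WEXITSTATUS(status) == 1
```

On a system with standard POSIX semantics, parent_lock_survives(False)
should report that the lock is held and parent_lock_survives(True) that it
has vanished, even though fd1 was never closed. Any library routine that
quietly opens and closes the same file can trigger this, which is part of
why mapping these semantics onto a network protocol takes care.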

So, it will be an interesting trip for some sucker^Wvolunteer ... through
the kernel, NFS code, user-kernel interface definition, RPC, XDR, various
daemons ... lots of complex interoperability tests...

I had planned to work on this, but I've had to drop those plans since
getting officially involved with port-alpha, as alpha needs some more basic
things (X servers, platform support) a lot more.

If someone else wants to volunteer I can give them a roadmap.

  --Ross Harvey