Subject: Re: userid partitioned swap spaces.
To: NetBSD Kernel Technical Discussion List <tech-kern@netbsd.org>
From: Greg A. Woods <woods@most.weird.com>
List: tech-kern
Date: 12/19/1998 14:15:32
[ On Sat, December 19, 1998 at 07:05:04 (+0200), Lucio de Re wrote: ]
> Subject: Re: userid partitioned swap spaces. 
>
> According to woods@most.weird.com (Greg A. Woods) :
> > 
> > The system should be more robust than relying on an external operator to
> > intervene in these kinds of situations.
> 
> I agree entirely, but it would be nice, when robustness isn't 
> available, to permit human intervention.  There are lots of instances 
> where the decision making is just too complex to throw mere instruction 
> cycles at it.

Human intervention (if authorized) should indeed be permitted, but I
don't think this is a case where there's any need for "complex decision
making" -- the algorithms are very simple and well known and with their
implementation the "decision" would still be in the hands of the
application program(s).  It's only the VM accounting that's in any way
"difficult", and it's only "difficult" because of some other design
decisions in the system.  Most of these issues have long ago been
resolved in other systems and all that should remain is choosing the
most comfortable solution for NetBSD (and then of course the effort of
"making it so" in running code).

I would like to point out too that my SIGOOVM proposal doesn't even
require the VM accounting be complete and accurate, at least not for a
rough-cut first-attempt at automating recovery in a manner that
guarantees the least damage to user processes.  It's only when you get
to being nasty with SIGDANGER and especially SIGKILL that you want to
make a "best effort" at doing the least "damage" necessary to effect
full recovery.
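
As a minimal sketch of the application side of such a scheme, assuming
the proposed signal is delivered like any other: SIGOOVM does not exist
anywhere, so SIGUSR1 stands in for it here, and the cache and every
function name below are hypothetical.  The handler only sets a flag, and
the process gives memory back from its main loop, so the "decision"
stays with the application.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdlib.h>

/* Assumption: SIGOOVM is only a proposal, so SIGUSR1 plays its role. */
#define SIGOOVM_STANDIN SIGUSR1

static volatile sig_atomic_t oovm_pending = 0;
static void *cache = NULL;      /* hypothetical discardable cache */
static size_t cache_size = 0;

/* Handler: only record the request; free() is not async-signal-safe. */
static void on_oovm(int signo)
{
    (void)signo;
    oovm_pending = 1;
}

/* Install the handler; returns 0 on success, -1 on failure. */
int install_oovm_handler(void)
{
    struct sigaction sa;

    sa.sa_handler = on_oovm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGOOVM_STANDIN, &sa, NULL);
}

/* (Re)build the discardable cache with n bytes; returns 0 on success. */
int fill_cache(size_t n)
{
    free(cache);
    cache = malloc(n);
    cache_size = (cache != NULL) ? n : 0;
    return (cache != NULL) ? 0 : -1;
}

/*
 * Called from the main loop.  If the system asked for memory back,
 * drop the cache and return the number of bytes released, else 0.
 */
size_t release_if_asked(void)
{
    size_t freed = 0;

    assert(cache != NULL || cache_size == 0);
    if (oovm_pending) {
        oovm_pending = 0;
        freed = cache_size;
        free(cache);
        cache = NULL;
        cache_size = 0;
    }
    return freed;
}
```

A program would call install_oovm_handler() once at startup and
release_if_asked() at a convenient point in its main loop.  A process
that ignores the request loses nothing at this stage; it is only the
later, nastier signals that take the choice away.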

> Ian suggests that the system may still have kittens, but in a better 
> controlled environment, this while we find a bigger stick to convince 
> the cow to behave itself.

That's a very humorous addition to the analogy, but I don't think it
fits.

Any solution which doesn't allow the system to recover and continue
operation all by itself is not going to sell to anyone who must have a
"never over-commit or equivalent" solution.  I sure as heck want to use
NetBSD in 24x7 "lights out" operations where the best human intervention
is likely only to be "the big red switch", and that's an unacceptable
way to effect recovery from so common an operating exception.

-- 
							Greg A. Woods

+1 416 218-0098      VE3TCP      <gwoods@acm.org>      <robohack!woods>
Planix, Inc. <woods@planix.com>; Secrets of the Weird <woods@weird.com>