Subject: Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
To: Charles M. Hannum <root@ihack.net>
From: Matthew Dillon <dillon@apollo.backplane.com>
List: tech-userlevel
Date: 07/13/1999 16:56:26
:>     And disallowing overcommit also does not give applications the *choice*
:>     of dealing gracefully, because they often cannot deal with the
:>     situation where they might be refused a reasonable request for memory.
:
:That's objectively false.  The application could do something useful
:if it wanted to.  That most applications don't isn't relevant.  The
:system can at least provide the mechanism.
:
:>     But back to your 1000-hour simulation:  If you are running it on an
:>     environment designed to deal with thousand-hour simulations, then
:>     you are obviously going to have sufficient swap such that your
:>     simulation will never get the axe anyway.
:
:That's also objectively false.  Most such environments I've had
:experience with are, in fact, multi-user systems.  As you've pointed
:out yourself, there is no combination of resource limits and whatnot
:that are guaranteed to prevent `crashing' a multi-user system due to
:overcommit.  My simulation should not be axed because of a bug in
:someone else's program.  (This is also not hypothetical.  There was a
:bug in one version of bash that caused it to consume all the memory it
:could and then fall over.)

    Has your simulation ever been kicked by the kernel due to lack of
    swap space?

    I'm betting the answer is no.  

    You have to consider the probability of an event occurring, not just
    the possibility that the event might occur.  If the probability is
    one in a million years, then it is not something you need to worry
    about relative to other things that, perhaps, you *should* be worrying
    about.

:>     It's easy to come up with potentials, but try assigning a probability
:>     to them and see how much they make sense then.  If you've been running
:>     thousand-hour simulations for 20 years and not a single one has ever
:>     been blown away due to the system running out of swap, then it obviously
:>     isn't an issue.
:
:And lastly, that is also objectively false.  Just because I haven't
:been screwed yet (and, in fact, I *have* been), that doesn't mean I
:won't be screwed in the future.

    The sky might fall tomorrow too, but you do not see me running around
    the room like a chicken with its head cut off.  Again, you are making
    the incorrect assumption that just because something *might* occur, it
    *will* occur.  Calculate the probability.  If the probability is not
    significant relative to other potential problems (like someone kicking the
    power cord out of the computer), then it isn't something you should be
    worrying about.

					-Matt
					Matthew Dillon 
					<dillon@backplane.com>