Subject: Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
To: Garance A Drosihn <drosih@rpi.edu>
From: Matthew Dillon <dillon@apollo.backplane.com>
List: tech-userlevel
Date: 07/14/1999 18:29:50
:For the moment I'll pretend that you honestly think that is an
:answer, and I'll note that the very same machine may have well
:over 100 processes each of which takes 1-2 meg of memory.  If
:the machine hits a really-out-of-memory error, I would be much
:much happier to see all 100+ of those processes killed, at once,
:than the one 40-meg process.
:
:Now tell me how I fix my swap under those circumstances.  If
:the answer is "buy infinite memory (ram or disk)", then we don't
:need any overcommit policy in the first place.  Note that the
:problem might be that these 100 processes start taking up 5 or
:10 meg rather than the 2 meg I'm used to.

    Everything scales.  If the load on your machine is such 
    that you have hundreds of processes taking 1-2MB of memory,
    then let's assume that such a machine has a reasonable
    memory configuration of, say, 256MB of ram, and a reasonable
    swap configuration of, say, 1GB.  Under normal operating
    conditions perhaps 100MB might be swapped out, giving you
    900MB of margin.  The actual VM footprint on such a machine
    might run on the order of 10 GB (rough guess), of which 350MB
    or so has actually been allocated.
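
    To make the arithmetic concrete, here is a rough sketch of the
    numbers above.  The variable and function names are made up for
    illustration, and 1GB is taken as 1024MB, so the margin comes out
    at 924MB rather than the rounded 900MB quoted above.

```python
# Back-of-the-envelope figures from the paragraph above, all in MB.
# Names are illustrative only; nothing here queries a real system.
GB = 1024  # MB per GB (assumption: binary units)

swap_total_mb = 1 * GB       # "reasonable swap configuration of, say, 1GB"
swapped_out_mb = 100         # "perhaps 100MB might be swapped out"
vm_footprint_mb = 10 * GB    # "on the order of 10 GB (rough guess)"
allocated_mb = 350           # "350MB or so has actually been allocated"


def swap_margin(total_mb, used_mb):
    """Swap still free before the machine is really out of memory."""
    return total_mb - used_mb


def allocated_fraction(allocated, footprint):
    """Share of the virtual footprint that is actually backed by pages."""
    return allocated / footprint


margin_mb = swap_margin(swap_total_mb, swapped_out_mb)
fraction = allocated_fraction(allocated_mb, vm_footprint_mb)

print(f"swap margin: {margin_mb} MB")
print(f"allocated share of VM footprint: {fraction:.1%}")
```

    The point of the second figure is that only a few percent of the
    virtual footprint is ever touched, which is why the margin goes
    as far as it does.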

    With 900MB of margin (which, I might add, is only about $30 worth
    of disk space) and reasonable process limits, it seems highly
    unlikely that the machine will ever run out of swap, even
    if a user makes an honest mistake.  I also rather seriously
    doubt that a hostile user would have any more or less success
    blowing away your process with the non-overcommit model versus
    otherwise.

    If 1G isn't enough, spend another $30 and throw 2G of swap
    online.  Or perhaps dedicate an entire $150 disk and throw
    6+ GB of swap online.

    The equivalent setup using a non-overcommit model would require
    considerably more swap to have the same reliability.  Plus,
    you have to realize that with either model, if you are talking
    about saving your work, the same code that does the save-and-exit
    in the non-overcommit model can just as easily do a checkpoint
    once an hour in the standard overcommit model.  Code that
    can't save/checkpoint would not survive either model.
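
    As a minimal sketch of what "checkpoint once an hour" might look
    like, assuming the program's state can be serialized as JSON: the
    names (save_checkpoint, load_checkpoint, Checkpointer) are
    hypothetical, and the write-to-temp-file-then-rename step is one
    common way to avoid leaving a half-written checkpoint behind, not
    the only one.

```python
import json
import os
import tempfile
import time


def save_checkpoint(state, path):
    """Write state to a temp file in the same directory, then rename it
    over the target so a crash mid-write leaves the old checkpoint intact."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".ckpt-")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(state, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise


def load_checkpoint(path, default=None):
    """Return the last saved state, or default if none exists yet."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return default


class Checkpointer:
    """Saves at most once per interval; call maybe_save() from the main loop."""

    def __init__(self, path, interval_s=3600.0, clock=time.monotonic):
        self.path = path
        self.interval_s = interval_s
        self.clock = clock
        self.last_save = clock()

    def maybe_save(self, state):
        now = self.clock()
        if now - self.last_save >= self.interval_s:
            save_checkpoint(state, self.path)
            self.last_save = now
            return True
        return False
```

    A long-running program would call maybe_save() somewhere in its
    main loop and load_checkpoint() at startup; if the process is
    killed, at most an hour of work is lost.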

    Disk is cheap.  Memory isn't (though it's getting better).
    Everything scales.

:I didn't mean to be casting aspersions on the general idea of
:overcommitting, or whatever it is that has your shorts all tied
:up in a knot.
:
:---
:Garance Alistair Drosehn           =   gad@eclipse.acs.rpi.edu
:Senior Systems Programmer          or  drosih@rpi.edu

					-Matt
					Matthew Dillon 
					<dillon@backplane.com>