Subject: Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
To: Daniel C. Sobral <dcs@newsguy.com>
From: David Scheidt <dscheidt@enteract.com>
List: tech-kern
Date: 07/16/1999 17:42:57
On Fri, 16 Jul 1999, Daniel C. Sobral wrote:

> Technical follow-up:
> 
> Contrary to what I previously said, a number of tests reveal that
> Solaris, indeed, does not overcommit. All non-read only segments,

Neither does HP-UX 10.x. (I haven't got an 11 box handy to check.)
The memory allocation process is something like this:
1) Reserve is allocated from a swap area.  Preference is given to
swap devices, even if a swap filesystem has a higher priority.
2) If there is no space on a swap device, swap is allocated from a
swap filesystem, if one is configured.  If the swap filesystem has
nothing free to allocate, the kernel attempts to grow the swap file
on that filesystem by swchunk (a tunable, default 2MB, I think).
(Swap on filesystems starts at zero or swchunk, and is grown as needed
up to the limit spec'd at swapon(1M) time.)
3) If this fails, either because there is no space on the file system, 
or the swapfile has reached its limit, memory (actual core) is allocated.
The system tunable swapmem_on determines whether memory is used for 
swap reserve or not.  Default is to use it.
4) If there isn't swap to reserve, the request fails, even if none of 
the reserved swap is used.  
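
A rough sketch of that ordering in Python, in case it helps.  This is a
toy model, not HP-UX code: the class, its field names, and the rule that
a request is satisfied entirely from one tier are my assumptions, and
the 2MB swchunk is the same "I think" as above.

```python
from dataclasses import dataclass

SWCHUNK_MB = 2  # assumed default, per the "2MB, I think" above


@dataclass
class SwapModel:
    """Toy model of the reservation order. All sizes are in MB."""
    device_free: int       # unreserved space on swap devices
    fs_size: int           # current size of the filesystem swap file
    fs_reserved: int       # part of the swap file already reserved
    fs_limit: int          # limit given at swapon(1M) time
    fs_disk_free: int      # free space on the underlying filesystem
    mem_free: int          # core available as swap reserve
    swapmem_on: bool = True

    def reserve(self, mb: int) -> str:
        """Reserve mb and return which tier satisfied the request."""
        # 1) Swap devices are tried first.
        if self.device_free >= mb:
            self.device_free -= mb
            return "device"

        # 2) Filesystem swap, grown in whole swchunk units if needed.
        shortfall = mb - (self.fs_size - self.fs_reserved)
        grow = 0
        if shortfall > 0:
            chunks = (shortfall + SWCHUNK_MB - 1) // SWCHUNK_MB
            grow = chunks * SWCHUNK_MB
        if self.fs_size + grow <= self.fs_limit and grow <= self.fs_disk_free:
            self.fs_size += grow
            self.fs_disk_free -= grow
            self.fs_reserved += mb
            return "filesystem"

        # 3) Memory (actual core), only if swapmem_on is set.
        if self.swapmem_on and self.mem_free >= mb:
            self.mem_free -= mb
            return "memory"

        # 4) Nothing left to reserve: fail, regardless of actual usage.
        raise MemoryError("no swap left to reserve")
```

Calling reserve() repeatedly on a small SwapModel walks through the
tiers in order (device, filesystem, memory) and then raises, even though
the model never records a single page actually being written out.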

The swapinfo(1M) man page makes this quite clear:

      +    Requests for more paging space will fail when they cannot be
           satisfied by reserving device, file system, or memory paging,
           even if some of the reserved paging space is not yet in use.
           Thus it is possible for requests for more paging space to be
           denied when some, or even all, of the paging areas show zero
           usage - space in those areas is completely reserved.
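
The reserved-versus-used distinction in that excerpt can be shown with a
few lines of Python.  Again a sketch only: PagingArea and its methods
are hypothetical names for illustration, not anything swapinfo(1M) or
the kernel exposes.

```python
class PagingArea:
    """Hypothetical paging area that tracks reservation and usage separately."""

    def __init__(self, total_mb):
        self.total_mb = total_mb
        self.reserved_mb = 0   # promised to processes
        self.used_mb = 0       # actually paged out

    def reserve(self, mb):
        """Admission is decided on reserved space, never on used space."""
        if self.reserved_mb + mb > self.total_mb:
            return False
        self.reserved_mb += mb
        return True

    def page_out(self, mb):
        """Consume space that was reserved earlier."""
        if self.used_mb + mb > self.reserved_mb:
            raise ValueError("cannot use more than was reserved")
        self.used_mb += mb


area = PagingArea(total_mb=1024)
first = area.reserve(1024)    # succeeds: the area is now fully reserved
second = area.reserve(1)      # refused, although used_mb is still 0
print(first, second, area.used_mb)
```

The second request is refused while the area still shows zero usage,
which is the situation the man page describes.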

The upside of this is that if you do run out of swap, the kernel doesn't
kill random processes.  The downside is that I have seen 4GB boxes, with
plenty of swap, run out with less than a gig of memory actually in use.
Oh, and if you swap to a filesystem, you can fill it up, without actually
using any of the space.

I don't know which behavior is more bogus.


David Scheidt