tech-kern: Re: userid partitioned swap spaces.

Subject: Re: userid partitioned swap spaces.
To: None <tech-kern@netbsd.org>
From: Greg A. Woods <woods@most.weird.com>
List: tech-kern
Date: 12/17/1998 00:45:53

[ On Thu, December 17, 1998 at 14:46:24 (+1030), Ian Dall wrote: ]
> Subject: Re: userid partitioned swap spaces.
>
> I think demand paged virtual memory was optional before V4. Some vendors
> made purely swapping systems a la edition 7. I don't know what kind
> of system 3B2's had. In any case, the SysVr2.2 system I was familiar with
> only allowed 16MB virtual memory so "huge sparse arrays" were not really
> an option.

I had thought VM was in SysVr2.2 on VAX and 3B2, but I can't remember
for sure.  There were certainly lots of systems with SysVr2.2 user-land
stuff that did not have VM.  Although I've used 3B2's running 2.2, that
was a *long* time ago, and I never got too far inside of them back then.

However, 3B2 SysVr3.0 and subsequent releases certainly did have VM.
The 3B2 was the primary development platform for SysV from 3.0 through
3.2, and was what you got when you licensed source code (though I think
by 3.1 there was an option to get the i386 source tape, even though
most i386 vendors were getting their source from ISC by then, which was
a somewhat independent port).  I've been inside most of those releases,
though not in the VM stuff -- mostly just the drivers.

Although SysVr4 was first developed on 3B2's, AT&T attempted to get lots
of independent hardware manufacturers to do "reference" ports.  However,
not every hardware vendor used the reference port, or at least didn't
stick very close to it, and some SysVr4 platforms deviated quite a long
way from what AT&T was shipping.  SunOS-5 is a primary example; of the
independent ports I know of, Commodore's Amiga port is the least
deviant and Pyramid DC/OSx perhaps the second least, at least in
user-land (the DC/OSx kernel is quite a bit different).

> Possibly. I think of two paradigms. In the first RAM is a cache for
> disk and in the other disk is overflow storage (backing store) for
> RAM. I tend to think of the latter as being the "overcommit" strategy
> since the former never allows overcommitting, but now I think about
> it, that is not quite right. The overflow strategy doesn't *have* to
> allow overcommitting so long as all virtual memory is mapped to either
> physical memory or to swap.

Well, in all true SysV systems with VM that I've ever used or
administered, the system always allocated a full set of swap pages for
all VM pages, regardless of whether those pages were ever touched by the
process.

> SysVr2.2 certainly had COW fork which means processes can require more
> swap at some arbitrary point when they touch a page, not just when
> they sbrk. Of course it is *possible* to allocate physical memory or
> swap to cover the worst case, but it is not my understanding of how
> things worked.

The only systems I remember that had COW fork also had VM!  ;-)

However, all SysV systems with COW that I ever used also always
allocated swap pages for all VM pages, regardless of whether they were
ever touched and thus physically allocated in RAM and used.
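
To make that behaviour concrete, here is a minimal sketch of strict swap
reservation, where every page is charged against swap at allocation time
whether or not it is ever touched.  This is illustrative C only, not
code from any SysV kernel; the struct and function names are made up
for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical system-wide swap accounting (names are illustrative). */
struct swap_acct {
    size_t total_pages;     /* swap pages configured on the system */
    size_t reserved_pages;  /* pages promised to processes, touched or not */
};

/*
 * Reserve swap when the address space grows (sbrk, fork, ...).
 * The request fails up front if the swap is not there, so a later
 * page touch can never run out of backing store.
 */
static bool swap_reserve(struct swap_acct *sa, size_t npages)
{
    /* reserved_pages never exceeds total_pages, so this cannot wrap. */
    if (npages > sa->total_pages - sa->reserved_pages)
        return false;
    sa->reserved_pages += npages;
    return true;
}

/* Give the reservation back when the pages are unmapped or the process exits. */
static void swap_release(struct swap_acct *sa, size_t npages)
{
    if (npages > sa->reserved_pages)
        npages = sa->reserved_pages;
    sa->reserved_pages -= npages;
}
```

The cost of this policy is that swap can look "full" while most of the
reserved pages have never been written; the benefit is that allocation
fails cleanly at sbrk/fork time instead of at some arbitrary page touch.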

I'm pretty sure all of this is covered clearly and in some detail in
Bach's "The Design of the Unix Operating System" (did I get that
right?).

Where this is relevant to 4.4BSD seems to be that, at least with the
original Mach VM, it is "difficult" to do all the accounting necessary
to determine just how many pages a process might be able to touch, and
thus they didn't get around to implementing it all (part of the
complexity seems to come from mmap, if I understand correctly).  The
result is that it's possible for the system to over-commit, promising
more VM than it can physically store on the swap disk(s) should some or
all processes decide to touch every page they've allocated.  As the
paragraph I quoted from McKusick et al. said, the system must then
renege on its commitment and try to kill some process(es) in an attempt
to reduce its VM commitment to a point where it has enough swap pages
available.  Killing the biggest process isn't going to help if that
process hasn't actually used many pages that are taking up swap space,
which is why I think the system sometimes just hangs.
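
The point about killing the "biggest" process can be shown with a small
sketch.  This is only an illustration of the two selection criteria, not
how 4.4BSD or Mach VM actually picks a victim; the struct, its fields,
and both functions are hypothetical.

```c
#include <stddef.h>

/* Hypothetical per-process VM summary (fields are illustrative). */
struct proc_vm {
    int    pid;
    size_t virt_pages;  /* pages allocated, i.e. promised to the process */
    size_t swap_pages;  /* pages actually occupying swap right now */
};

/* Index of the process with the largest virtual size.  Requires n > 0. */
static size_t pick_by_virtual_size(const struct proc_vm *p, size_t n)
{
    size_t best = 0;
    for (size_t i = 1; i < n; i++)
        if (p[i].virt_pages > p[best].virt_pages)
            best = i;
    return best;
}

/* Index of the process holding the most swap.  Requires n > 0. */
static size_t pick_by_swap_in_use(const struct proc_vm *p, size_t n)
{
    size_t best = 0;
    for (size_t i = 1; i < n; i++)
        if (p[i].swap_pages > p[best].swap_pages)
            best = i;
    return best;
}
```

Given one process with 100000 pages allocated but only 10 on swap, and
another with 2000 allocated and 1500 on swap, the first function picks
the former and frees almost nothing, while the second picks the process
whose death would actually return swap pages.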

I'd love to hear some comments from those who know more about UVM as to
whether or not it's going to be easy/hard/possible to add the necessary
VM accounting to it, if indeed it's not already there, and if it might
be possible to have a flag to turn on and off the over-commit "feature",
if not on a per-process basis (or per-user basis, a la the subject of
this thread), then at least on a system-wide basis.

I'd also like to see some discussion as to whether or not the SIGDANGER
solution might work reliably for UVM....

-- 
							Greg A. Woods

+1 416 218-0098      VE3TCP      <gwoods@acm.org>      <robohack!woods>
Planix, Inc. <woods@planix.com>; Secrets of the Weird <woods@weird.com>