Subject: Re: Virtual Memory Subsystem
To: Jonathan Stone <jonathan@DSG.Stanford.EDU>
From: John S. Dyson <toor@dyson.iquest.net>
List: port-i386
Date: 11/26/1996 08:09:13
> 
> >Well, it doesn't always seem to matter what the end-users think.  There are
> >a few annoying misfeatures in NetBSD that (IMHO) should have been fixed on
> >day #1, but haven't.  This VM thing may not be fixed for another 3 years, I
> >guess.  I would also appreciate it if my machines didn't come to a complete
> >halt when I run out of RAM.
> 
> 
> I don't understand. 
> 
> There is a fix for that in -current, and has been, I believe, since
> before 1.2 was released. I hope it goes in as an official patch for
> 1.2.  The relevant change was from 1.23 to 1.24 of
> sys/vm/vm_pageout.c. pr #2755 explains the problem.
> 
This mitigates some of the performance problems when running out of memory,
but doesn't help the swap leak problem.  Unfortunately, I have my mail
reader set up to read mail backwards, and thought that you were talking
about swap leaks (collapse problems).

The original Lite/2 VM system has many serious problems, and needs someone
to fix the problems in both the NetBSD and OpenBSD camps.  The VM system
is NOT that complicated.  (Esp. since most of the problems have already
been fixed in FreeBSD, and only portability issues need to be visited now.)

Please note that the tsleep mod that I had suggested in vm_pageout (and
that was adopted into NetBSD) is not a "solution" to a problem, but only a
quick workaround until someone takes the general problem on.

1) Stay away from simple clock algorithms, though even a working clock
   algorithm is better than the Lite/2 code.
2) Allow the pageout daemon to block at the right times.  (If the
   paging device is overloaded, it doesn't help to start freeing
   useful pages.)  -- this is partially addressed by the tsleep mod.
3) Note that the original code is using a dreaded FIFO algorithm,
   throwing away a lot of useful info on architectures that support
   the reference bit.  A hybrid algorithm is possible that would support
   hardware that goes to the trouble of providing reference info.
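
To make point 3 concrete, here is a rough sketch of one possible hybrid
(FIFO with a second chance).  It is NOT the Lite/2 or FreeBSD code; the
names (struct page, struct page_queue, pick_victim, NPAGES) are made up
for illustration, and the "referenced" field only stands in for whatever
the pmap layer reports.  With have_refbit false it degrades to plain FIFO.

```c
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 8                /* hypothetical tiny page pool */

struct page {
    int  id;
    bool referenced;            /* stand-in for the hardware reference bit */
};

/* Fixed-size ring buffer used as a FIFO of resident pages. */
struct page_queue {
    struct page *slot[NPAGES];
    size_t head;                /* index of the oldest page */
    size_t count;               /* number of pages queued */
};

/* Append a page at the tail; returns false if the queue is full. */
static bool pq_push(struct page_queue *q, struct page *p)
{
    if (q->count == NPAGES)
        return false;
    q->slot[(q->head + q->count) % NPAGES] = p;
    q->count++;
    return true;
}

/* Remove and return the oldest page, or NULL if the queue is empty. */
static struct page *pq_pop(struct page_queue *q)
{
    if (q->count == 0)
        return NULL;
    struct page *p = q->slot[q->head];
    q->head = (q->head + 1) % NPAGES;
    q->count--;
    return p;
}

/*
 * Choose a page to evict.  Pages are examined oldest first.  If reference
 * info is available and the page was referenced, clear the bit and requeue
 * the page at the tail (its "second chance"); otherwise evict it.
 * Without reference info this is exactly FIFO.
 */
static struct page *pick_victim(struct page_queue *q, bool have_refbit)
{
    size_t n = q->count;

    for (size_t i = 0; i < n; i++) {
        struct page *p = pq_pop(q);
        if (!have_refbit || !p->referenced)
            return p;
        p->referenced = false;
        pq_push(q, p);          /* cannot fail: we just popped one entry */
    }
    /* Every page was referenced; all bits are now clear, take the oldest. */
    return pq_pop(q);
}
```

This ignores dirty pages, wired pages, and the active/inactive queue split,
all of which a real pageout daemon has to deal with.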

There may be some problems with reference bit handling in the rest of
the code, so if you modify the pageout daemon, be careful to also make
sure that the reference bit is managed correctly.  (I seem to remember
problems in that area.)  That might be the reason that the pageout daemon
uses the wasteful algorithm in moving active pages to the inactive queue.
(The algorithm could itself be a bug workaround.)

VM bugs appear to be common across OSes, though.  I read some info recently
(I think on USENET) about bugs in MacOS -- very interesting...  Also, there
is a borderline silly (and short) paper from Microsoft defending FIFO.

John