Subject: Re: swap leakage?
To: Frank van der Linden <frank@fwi.uva.nl>
From: Chris G Demetriou <Chris_G_Demetriou@BALVENIE.PDL.CS.CMU.EDU>
List: current-users
Date: 11/05/1995 20:31:45
> > I think this problem is real, considering the number of people who have
> > reported it.
> 
> It certainly is real. It's an old problem in the VM code.

And, unless i'm mistaken, it was one of the first problems that the
FreeBSD folks fixed in their VM code.  (Granted the fix was a hack...
but which is better: "hacked" or "broken"?)

> If I remember
> correctly, it has to do with VM objects not being cleaned up when having
> a pager associated with them, and it occurs in situations where a
> child process exits while pages in the parent have been paged out.

yeah, in a nutshell, objects with pagers can't be collapsed, and
paging space for those objects can't necessarily be freed until they're
collapsed or deallocated.  (I've included Andrew Herbert's explanation
of the problem, written a couple of years ago (!!), below, and i can
provide his demonstration program, etc., to anybody who wants it.)

The problem usually shows itself when you have a long-running server
process on a highly-loaded machine.  When the server forks to exec a
new program (for example), if it pages before the child has exec'd,
then swap space will typically be 'lost' until the parent exits.


chris
=======
386bsd vm object system code...

There's a nasty problem with 386bsd's vm code (also present in netrel2,
mach 2.5 and even mach 3.0, I believe) where a vm_object can't be collapsed
if it has a pager.  Pagers are typically allocated to previously pager-less
objects by the pageout daemon when memory runs short.  This inability to
collapse objects with pagers is quite serious because pager resources (usually
swap space) eventually run out as a result of an enormous string of
un-collapsible vm_objects, each gobbling some amount of swap.

Where does the string of shadow objects come from in the first place?  Each
time a process forks, the object handling its data segment (for example) is
made into a shadow of two new objects which are given to the parent and the
child of the fork.  When the child exits, the parent still has a shadowed
vm_object.  If nothing in the parent has been paged-out, all is well - a call
to vm_object_collapse() will clobber the shadow object, since no-one else needs
it any more.  Otherwise, the problem described in the previous paragraph
begins.
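
As a rough illustration of the chain growth described above, here is a toy
user-space model in C.  It is only a sketch: struct toy_object and every
toy_* function are hypothetical names invented for this example, not the
386bsd vm_object code, and "paging" is reduced to setting a has_pager flag
on the object left behind by the fork.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-in for a vm_object; NOT the kernel structure. */
struct toy_object {
    struct toy_object *backing; /* object sitting behind this one, or NULL */
    int refs;                   /* how many holders reference this object */
    bool has_pager;             /* set once "pageout" gave it a pager */
};

static struct toy_object *toy_new(struct toy_object *backing)
{
    struct toy_object *o = calloc(1, sizeof *o);
    assert(o != NULL);
    o->backing = backing;
    o->refs = 1;                /* held by the owning process */
    return o;
}

/* fork: the current object ends up behind two fresh objects,
 * one kept by the parent and one handed to the child. */
static struct toy_object *toy_fork(struct toy_object **parent_top)
{
    struct toy_object *old = *parent_top;
    struct toy_object *child_top = toy_new(old);
    *parent_top = toy_new(old);
    old->refs += 1;             /* was 1 (the process), now 2 (both new objects) */
    return child_top;
}

/* The child exits: its object goes away and drops its reference. */
static void toy_child_exit(struct toy_object *child_top)
{
    child_top->backing->refs--;
    free(child_top);
}

/* Collapse the object behind top, but only if nobody else uses it
 * and it has no pager (the restriction the text complains about). */
static bool toy_collapse(struct toy_object *top)
{
    struct toy_object *b = top->backing;
    if (b == NULL || b->refs != 1 || b->has_pager)
        return false;
    top->backing = b->backing;  /* b's reference to its backing moves to top */
    free(b);
    return true;
}

/* Run `forks` fork/exit rounds and return the final chain length. */
static int toy_simulate(int forks, bool page_out_each_round)
{
    struct toy_object *top = toy_new(NULL);
    for (int i = 0; i < forks; i++) {
        struct toy_object *child = toy_fork(&top);
        if (page_out_each_round)
            top->backing->has_pager = true; /* paged while child is alive */
        toy_child_exit(child);
        toy_collapse(top);
    }
    int len = 0;
    while (top != NULL) {       /* count and free the whole chain */
        struct toy_object *next = top->backing;
        free(top);
        top = next;
        len++;
    }
    return len;
}
```

In this model, toy_simulate(10, false) leaves a chain of one object, while
toy_simulate(10, true) leaves eleven: one un-collapsible object per fork.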

I have written a program (to be run as root) that produces a vm_object-usage
dump for a given process.  Try it on the sendmail daemon on a heavily loaded
system and see what I mean by vm_objects not being collapsed...  As an example,
its output when run on sendmail is included in the file sendmail.vmobj.  If
you can't see similar behaviour on your system after inducing simultaneous
swapping and forking (e.g. with the included vm-test.c), please let me know!
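
For readers without the attachment, a minimal sketch of how one might force
forking and page-dirtying to overlap is shown here.  This is not the
vm-test.c mentioned above; fork_while_dirty() is a hypothetical helper
written for this example.  It assumes a POSIX system, and whether it
actually provokes swapping depends on choosing a buffer larger than free
memory.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (NOT the original vm-test.c).
 * Dirties `bytes` of heap, then forks `rounds` children one at a time.
 * Each child blocks on a pipe so that the parent rewrites the whole
 * buffer while the child is still alive.  Returns the number of
 * children reaped, or -1 if the buffer could not be allocated. */
static int fork_while_dirty(size_t bytes, int rounds)
{
    assert(rounds >= 0);
    char *buf = malloc(bytes);
    if (buf == NULL)
        return -1;
    memset(buf, 0, bytes);              /* make every page exist */

    int reaped = 0;
    for (int i = 0; i < rounds; i++) {
        int fds[2];
        if (pipe(fds) != 0)
            break;
        pid_t pid = fork();
        if (pid < 0) {
            close(fds[0]);
            close(fds[1]);
            break;
        }
        if (pid == 0) {
            char c;
            close(fds[1]);
            /* Blocks until the parent closes its write end (EOF). */
            if (read(fds[0], &c, 1) < 0)
                _exit(1);
            _exit(0);
        }
        close(fds[0]);
        memset(buf, i + 1, bytes);      /* dirty pages while child lives */
        close(fds[1]);                  /* release the child */
        if (waitpid(pid, NULL, 0) == pid)
            reaped++;
    }
    free(buf);
    return reaped;
}
```

Calling fork_while_dirty() with a large size and many rounds, then watching
swap usage from another terminal, is one way to look for the behaviour.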

Finally, I wrote an attempt at fixing the problem.  This is the reason I've
mailed you all this junk - I'd really appreciate any ideas you have on what
I have done wrong.  The basic strategy used is to copy all pages handled by
a shadow object's pager into the parent, and then collapse the shadow - if
the shadow only has the one reference.  Unfortunately something is broken in
my mods.  It seems like the get-page-from-pager code is returning the wrong
object, *and* the VM_WAIT call hangs the system.  The code is comprehensively
commented to explain what is going on, so please give it a look...
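
The basic strategy can be sketched on a toy model, with heavy caveats.  This
is not the patch described above: struct toy_object and
toy_collapse_with_pagein() are hypothetical, "front" and "backing" are this
example's own names for the surviving object and the one being collapsed
into it, and a pager read is reduced to copying an int.  Waiting for free
memory (the VM_WAIT step) and locking are ignored entirely.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_NPAGES 4

enum toy_state { TOY_ABSENT, TOY_RESIDENT, TOY_PAGED_OUT };

/* Toy stand-in for a vm_object with a fixed number of pages. */
struct toy_object {
    struct toy_object *backing;        /* object behind this one, or NULL */
    int refs;                          /* number of holders */
    enum toy_state state[TOY_NPAGES];  /* where each page currently lives */
    int data[TOY_NPAGES];              /* stands in for page contents */
};

/* Pull every page the backing object holds (resident or paged out)
 * into `front` unless `front` already has its own copy, then drop the
 * backing object.  Only done when the backing object has exactly one
 * reference.  Returns the number of toy swap slots released, or -1 if
 * the collapse was not allowed. */
static int toy_collapse_with_pagein(struct toy_object *front)
{
    struct toy_object *b = front->backing;
    if (b == NULL || b->refs != 1)
        return -1;

    int released = 0;
    for (int i = 0; i < TOY_NPAGES; i++) {
        if (b->state[i] == TOY_PAGED_OUT)
            released++;                /* slot is freed with the pager */
        if (b->state[i] != TOY_ABSENT && front->state[i] == TOY_ABSENT) {
            front->data[i] = b->data[i];   /* pager read plus copy */
            front->state[i] = TOY_RESIDENT;
        }
    }
    front->backing = b->backing;       /* skip over the collapsed object */
    free(b);
    return released;
}
```

The single-reference check mirrors the condition stated above; pages the
front object has already modified are left alone, since its copy is newer.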

thanks,
Andrew Herbert <andrew@werple.apana.org.au> - apprentice VM hacker :)
11 April 1993