current-users: Re: -current memory leaks?

Subject: Re: -current memory leaks?
To: Luke Mewburn <lm@melb.cpr.itg.telecom.com.au>
From: Irving Reid <irving@platform.com>
List: current-users
Date: 04/13/1995 11:05:56
> > Peter Seebach <seebs@solon.com>
> Luke Mewburn <lm@melb.cpr.itg.telecom.com.au>

> > Our -current system (about a week or two out of date, I guess, but
> > similar.) is exhibiting memory leakage; it runs out of memory and
> > panics after about 4 days.  We have 3-4 users average, 8-9 peak.
> > We have 8 megs of "real" memory and about 35 of swap.  As time goes
> > on, the swap space used goes from about 20% (fresh boot) to 50% or
> > so (a day or two) to 90% (3-4 days) and then crashes at 100%.  This
> > repeats.  We are using slip, but not PPP.

> > Are there any known memory leaks like this?  At the end of the several
> > days, there are no big processes; the total of process vm reported by
> > PS may be as high as 12 megs or so; the total reported by vmstat or
> > pstat -s will be in the 30's or 40's.

Yes!  I've been tracking this on my gateway machine for the last couple 
of days, and these are exactly the symptoms I've been seeing.  The kernel 
has also been slowly growing; in particular, the "vmstat -m" output 
shows the "VM object" and "VM pgdata" statistics slowly increasing, as 
well as the total swap space in use.
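
For a rough sense of the timescale, the swap figures Peter quotes can be
extrapolated linearly.  The sketch below is a plain least-squares fit.  The
day values (0, 1.5, 3.5) are assumed midpoints for "fresh boot", "a day or
two" and "3-4 days", and days_until_full is a made-up helper name, not an
existing tool.

```python
def days_until_full(samples, full_pct=100.0):
    """Fit pct = intercept + slope * day by least squares and return the
    day at which the fitted line reaches full_pct."""
    n = len(samples)
    mean_x = sum(day for day, _ in samples) / n
    mean_y = sum(pct for _, pct in samples) / n
    sxx = sum((day - mean_x) ** 2 for day, _ in samples)
    sxy = sum((day - mean_x) * (pct - mean_y) for day, pct in samples)
    if sxx == 0:
        raise ValueError("need samples taken at different times")
    slope = sxy / sxx
    if slope <= 0:
        raise ValueError("swap use is not growing")
    intercept = mean_y - slope * mean_x
    return (full_pct - intercept) / slope


# (days since boot, percent of swap in use); the day values are assumptions.
SAMPLES = [(0.0, 20.0), (1.5, 50.0), (3.5, 90.0)]

if __name__ == "__main__":
    print(round(days_until_full(SAMPLES), 1))
```

With these assumed points the fit works out to roughly 20% of swap per day,
reaching 100% around day 4, which is consistent with the reported panic
after about 4 days.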

> The following item was extracted from some mail sent to
> current-users on Thu, 10 Nov 1994 12:07:34 +1100.
> 
> -- start include --
>   > From: Andrew Herbert <andrew@werple.apana.org.au>
>   > Message-Id: <199411100107.MAA15227@eplet.apana.org.au>
>   
>   There's a nasty problem with 386bsd's vm code (also present in netrel2, mach
>   2.5 and even mach 3.0, I believe) where a vm_object can't be collapsed if it
>   has a pager.  Pagers are typically allocated to previously pager-less
>   objects by the pageout daemon when memory runs short.  This inability to
>   collapse objects with pagers is quite serious because pager resources
>   (usually swap space) eventually run out as a result of an enormous string of
>   un-collapsible vm_objects, each gobbling some amount of swap.
>   ...

This sounds like exactly what's happening.  I killed and restarted a 
few of the processes that fork a lot (httpd, cron, smapd from the TIS 
toolkit), and all of a sudden my swap space and kernel VM tables are 
completely back to normal.
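
The mechanism Andrew describes can be illustrated with a toy model.  This is
only a sketch under simplifying assumptions, not the kernel code: VMObject,
fork_and_reap, collapse and simulate are hypothetical names, each fork/exit
cycle of a long-lived daemon is assumed to push one new object onto its
chain, and the 64 KB paged out per cycle is an arbitrary figure.

```python
class VMObject:
    """Toy stand-in for a vm_object; not the real kernel structure."""

    def __init__(self, backing=None):
        self.has_pager = False
        self.swap_kb = 0
        self.backing = backing  # next object in the chain


def fork_and_reap(top, paged_out_kb=0):
    """One fork/exit cycle as seen by the long-lived parent.  If memory was
    short, the old top object is given a pager and some swap first."""
    if paged_out_kb:
        top.has_pager = True
        top.swap_kb += paged_out_kb
    return VMObject(backing=top)


def collapse(top):
    """Merge pager-less backing objects into the front object, stopping at
    the first backing object that has a pager."""
    while top.backing is not None and not top.backing.has_pager:
        top.backing = top.backing.backing
    return top


def chain_stats(top):
    """Return (number of objects in the chain, total swap held in KB)."""
    length, swap_kb = 0, 0
    obj = top
    while obj is not None:
        length += 1
        swap_kb += obj.swap_kb
        obj = obj.backing
    return length, swap_kb


def simulate(cycles, paged_out_kb):
    """Run the given number of fork/exit cycles and report the chain."""
    top = VMObject()
    for _ in range(cycles):
        top = collapse(fork_and_reap(top, paged_out_kb))
    return chain_stats(top)


if __name__ == "__main__":
    print(simulate(100, 0))
    print(simulate(100, 64))
```

In this model the chain stays at a single object when nothing is paged out,
while 100 cycles with paging leave 101 objects holding 6400 KB of swap.  The
whole chain goes away only when its owner exits, which fits the observation
that restarting the forking daemons returned swap use to normal.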

Dare I hope for a fix?  I'd really rather not have to babysit this 
machine so closely, though I suppose I could write a cron job to kill 
and restart a few daemons.
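
A minimal sketch of what such a cron job might look like, assuming the
current swap usage has already been obtained as a percentage (for instance
from pstat -s, whose output format is not assumed here).  The restart
commands, the 70% threshold and the name restart_if_swap_high are all
placeholders, not anything from an existing tool.

```python
import subprocess

# Hypothetical restart commands: placeholders only.  The real commands
# depend on how each daemon is started on the machine in question.
RESTART_COMMANDS = {
    "httpd": ["echo", "restart httpd here"],
    "cron": ["echo", "restart cron here"],
    "smapd": ["echo", "restart smapd here"],
}


def restart_if_swap_high(swap_used_pct, threshold_pct=70.0,
                         commands=None, run=subprocess.run):
    """Run every restart command once swap use reaches the threshold.

    Returns the names of the daemons whose restart command was run.
    The run parameter exists so the function can be exercised without
    touching real daemons.
    """
    if commands is None:
        commands = RESTART_COMMANDS
    if swap_used_pct < threshold_pct:
        return []
    restarted = []
    for name, argv in commands.items():
        run(argv, check=False)
        restarted.append(name)
    return restarted
```

Calling restart_if_swap_high(90.0) from a periodic job would run the three
placeholder commands; below the threshold it does nothing.  This only works
around the leak by discarding the affected processes and is not a fix.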

 - irving -