current-users: Re: memory tester shows up swap/page tuning bug [was Re: BUFFERCACHE, PR 1903]

Subject: Re: memory tester shows up swap/page tuning bug [was Re: BUFFERCACHE, PR 1903]
To: <>
From: Jonathan Stone <jonathan@DSG.Stanford.EDU>
List: current-users
Date: 09/15/1996 03:17:59
[[ The following was sent to Sean, Havard, and Jukka Martin on
   Sat, 14 Sep 1996 18:10:45 -0700, but the Cc: to current-users
   and Laine never went out, so I'm re-sending it... ]]


Executive Summary: I think my wild guess is at least as accurate as
Sean's, but we may both be partially right.  "systat vmstat" shows
whatever is causing the freezes is *not* disk saturation, and that
the free-page count does go to zero during the freeze periods.




>| The following shar'ed program allocates and touches large amounts of
>| physical memory.

>This should be committed as /etc/chill. :)

That should be /usr/sbin/chill ;).  If we care about VM performance
under overload conditions, shipping some kind of memory hog isn't such
a bad idea.

I've run a memory-toucher on a machine where it, and the rlogind or
xterm running it, are the *only* active processes.  I don't see why
swapping should be invoked in that scenario.  I also added a flag to
the memory-toucher to put it in an infinite loop.  It runs touching
until it hits the pathological regime (after touching 50 Mbytes), at
which point the machine goes catatonic for a minute. When the machine
un-freezes, the memory toucher runs smoothly (as do all other user
processes) until it's touched another 16 Mbytes. The machine then
freezes again.
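
For anyone who wants to reproduce this, here is a minimal sketch of
what such a memory-toucher might look like.  It is *not* the shar'ed
program from earlier in the thread.  The name touch_memory(), the
1-Mbyte chunk size, and the meaning of the "forever" argument (sweep
over everything already allocated, repeatedly) are assumptions made
for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical memory-toucher sketch.
 *
 * Allocates up to `mbytes` megabytes, one 1-Mbyte chunk at a time, and
 * writes one byte in every page so that each page is really faulted in.
 * Progress goes to stderr so the point where the machine stalls can be
 * read off the terminal.
 *
 * If `forever` is non-zero, the function never returns: it keeps
 * sweeping over every page it allocated (an assumed reading of the
 * "infinite loop" flag).  Otherwise it frees everything and returns
 * the number of megabytes it managed to touch.
 */
size_t touch_memory(size_t mbytes, size_t pagesize, int forever)
{
    const size_t chunk = 1024 * 1024;
    unsigned char **chunks;
    size_t n, i, off;

    if (pagesize == 0)
        return 0;
    chunks = malloc(mbytes * sizeof *chunks);
    if (chunks == NULL)
        return 0;

    for (n = 0; n < mbytes; n++) {
        chunks[n] = malloc(chunk);
        if (chunks[n] == NULL)
            break;                      /* out of memory or swap: stop here */
        for (off = 0; off < chunk; off += pagesize)
            chunks[n][off] = (unsigned char)n;   /* dirty one byte per page */
        fprintf(stderr, "touched %lu Mbytes\n", (unsigned long)(n + 1));
    }
    assert(n <= mbytes);

    while (forever) {
        /* volatile keeps the compiler from optimising the sweep away */
        for (i = 0; i < n; i++) {
            volatile unsigned char *p = chunks[i];
            for (off = 0; off < chunk; off += pagesize)
                p[off]++;
        }
    }

    for (i = 0; i < n; i++)
        free(chunks[i]);
    free(chunks);
    return n;
}
```

Calling touch_memory(64, 4096, 1) from a trivial main() would try to
dirty 64 Mbytes and then keep re-touching them; the 4096-byte page
size is also an assumption and should be set to match the port.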

The "run for another 16 Mbytes, then freeze" cycle repeats forever.
At first glance, Sean's swapout/swapin-in-entirety scenario is not
consistent with that behaviour.

>Use kgdb and stare at the same figures that systat stares at
>to report the memory figures in the top left of the vmstat
>display?

(sigh) The ports I use don't have kgdb.

Some good news is, if I run the memory-toucher as an unprivileged
user, and run systat vmstat as root, I *do* get to see systat numbers.
The "free" count does go to zero, *exactly* during the "frozen"
states, and not otherwise.  Nothing else is running on this machine;
the only disk activity is due to paging and swapping.
Disk activity is *lower* during the "frozen" state than the non-frozen
state, by a factor of 20. (It's almost exactly 64 transfers/sec, 256
Kbytes/sec during the entirety of the frozen state.)

That correlates much better with my observation that the disk sounded
mostly silent, and fits much better with my hypothesis than Sean's.

During the frozen state, the "Proc:" line looks like

Proc:r  p  d  s  w    Csw  Trp  Sys  Int  Sof  Flt        cow
        1  1  9        14    9   85         9    2        objlk
                                                          objht
   0.0% Sys   0.0% User   0.0% Nice 100.0% Idle           zfod
|    |    |    |    |    |    |    |    |    |    |       nzfod

During the non-frozen states, the CPU is approximately 30% user time,
60% idle, which is consistent with the memory-toucher spending most
of its time thrashing.

I see a number of swapouts (usually 3) just as the machine freezes
up, and then a burst of swapins/swapouts just as the machine
un-freezes.

>Maintaining a pool of memory to satisfy large short-term demand
>doesn't seem attractive at first glance, since it reduces
>available memory in conditions where the set of active 
>pages would fit in physical memory if the spare pool wasn't
>there.   To me, this looks like forcing lots of really unnecessary
>paging to work around what appears to be a bug in the swapper.

A demand-paged VM system should maintain a small pool of pages -- a
few hundred, say -- at all times.  Consider what happens if we try
not to do this.  I assume a VM system in normal operation has a
"background rate" of pagefaults.  If the free-page set goes to zero,
then *every* pagefault that's incurred has to force some resident page
out of memory. The fault cannot be serviced until that's done, and the
process incurring the fault is suspended in the meantime.

If you don't keep sufficient free pages to satisfy ongoing faults,
the system bogs down with processes blocked waiting for pages to be
freed.  The observations I have definitely confirm this is happening
during the "freeze" states.  They do *not* show whether that actually
causes the freezes or if both are symptoms of another bug.
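
To make the argument concrete, here is a toy model of it.  It is not
the NetBSD pagedaemon or any real VM code; the struct, the function
names, and all the numbers are hypothetical.  Each "tick" a background
daemon may refill the free list if it has dropped below a low-water
mark, and then a fixed number of page faults arrive.  A fault that
finds the free list empty has to wait for a synchronous eviction, and
is counted as stalled.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: every field and constant here is an assumption. */
struct vm_model {
    unsigned long free_pages;   /* pages currently on the free list      */
    unsigned long free_min;     /* daemon runs when free_pages < this    */
    unsigned long free_target;  /* daemon refills the free list to this  */
    unsigned long stalled;      /* faults that had to wait for eviction  */
};

/* Background pageout: reclaim at most max_reclaim pages per tick. */
static void pagedaemon_tick(struct vm_model *vm, unsigned long max_reclaim)
{
    unsigned long want;

    if (vm->free_pages >= vm->free_min)
        return;                         /* enough free pages, stay asleep */
    want = vm->free_target - vm->free_pages;
    if (want > max_reclaim)
        want = max_reclaim;
    vm->free_pages += want;
}

/* One page fault: fast if a free page exists, otherwise it stalls. */
static void page_fault(struct vm_model *vm)
{
    if (vm->free_pages > 0)
        vm->free_pages--;               /* serviced from the free pool */
    else
        vm->stalled++;                  /* must evict a resident page first;
                                           the freed page is used at once */
}

/* Run the model and return how many faults stalled. */
unsigned long simulate(unsigned long initial_free,
                       unsigned long free_min,
                       unsigned long free_target,
                       unsigned long nticks,
                       unsigned long faults_per_tick,
                       unsigned long reclaim_per_tick)
{
    struct vm_model vm;
    unsigned long t, f;

    assert(free_min <= free_target);
    vm.free_pages = initial_free;
    vm.free_min = free_min;
    vm.free_target = free_target;
    vm.stalled = 0;

    for (t = 0; t < nticks; t++) {
        pagedaemon_tick(&vm, reclaim_per_tick);
        for (f = 0; f < faults_per_tick; f++)
            page_fault(&vm);
    }
    return vm.stalled;
}
```

With no reserve, simulate(400, 0, 0, 100, 16, 64) runs smoothly until
the initial 400 free pages are used up, after which every remaining
fault stalls.  With a reserve of a few hundred pages,
simulate(400, 200, 400, 100, 16, 64), the daemon keeps ahead of the
fault rate and nothing stalls.  The price is the pages held back in
the pool, which is the cost the quoted objection above is worried
about.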

It would be mildly ironic if the swapper was, itself, running
the freelist into the ground...

--Jonathan