Subject: Re: memory tester shows up swap/page tuning bug ...
To: Havard.Eidnes@runit.sintef.no, jmarin@pyy.jmp.fi
From: Sean Doran <smd@cesium.clock.org>
List: current-users
Date: 09/14/1996 18:54:18
[forgive the subject change and the obvious loss-of-focus-due-
to-frequent-interruption-of-connectivity-to-the-editor style problems]

| During the frozen state, the "Proc:" line looks like
| 
| Proc:r  p  d  s  w    Csw  Trp  Sys  Int  Sof  Flt        cow
|         1  1  9        14    9   85         9    2        objlk

Hm, is it always one process in page-wait?  Can you try this
with other active non-root processes, just to see?

| What I see is a number of swapouts (usually 3) just as the machine
| freezes up, and then a burst of swapins/swapouts just as the machine
| un-freezes.

What kind of paging is being seen during all of this?

This behaviour kind of looks like what can happen on machines with
large memories when one is using a single-handed CLOCK algorithm:
you can end up waiting for a full revolution of the hand through
memory before you find something that can be paged out.

This was fixed in 4.3ish with the two-handed clock, which meant
the time to sample a reference bit was the time it took the CLOCK
scan to advance across the gap between the hands.
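
To make that concrete, here's a toy sketch of the two-handed scan
(all the names below are made up; this is not the 4.3BSD vm_pageout
code).  The front hand clears reference bits, the back hand trails
a fixed "handspread" behind it and reclaims anything whose bit is
still clear, so the sampling window is the gap between the hands
rather than a full revolution:

#include <stddef.h>

struct page {
    int ref_bit;        /* set when the page is touched */
};

#define NPAGES  1024
static struct page pages[NPAGES];

/* stand-in for actually freeing/paging out a page */
static void reclaim(struct page *pg) { (void)pg; }

static void
clock_scan(size_t *front, size_t handspread, size_t nscan)
{
    while (nscan-- > 0) {
        size_t back = (*front + NPAGES - handspread) % NPAGES;

        pages[*front].ref_bit = 0;      /* front hand: clear bit */

        if (pages[back].ref_bit == 0)   /* back hand: still clear? */
            reclaim(&pages[back]);

        *front = (*front + 1) % NPAGES;
    }
}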

I don't see how the previous behaviour could be attributed to the
pager unless a process lucked out and was touching memory only in
the gap between the two hands, and I don't see that happening for
long periods of time, so this seems like a wild-goose chase...

I suppose it might be interesting to know what the pager daemon
was really doing during all of this, and I'd be resorting to
printfs at this point. :-)

| A demand-paged VM system should maintain a small pool of free
| pages -- a few hundred, say -- at all times.  Consider what
| happens if we try not to do this.  I assume a VM system in normal
| operation has a "background rate" of pagefaults.  If the free-page
| set goes to zero, then *every* pagefault that's incurred has to
| force some resident page out of memory.  The fault cannot be
| serviced until that's done, and the process incurring the fault is
| suspended in the meantime.

Yes, this is true.  As I said, spot the LISP person. :-)

Doesn't the pager daemon do this already, by paging things out
whenever there are fewer than desfree pages free?
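
(For concreteness, the classic policy looks roughly like this;
lotsfree and desfree are the 4.3BSD parameter names, everything
else below is invented for illustration:)

static int freemem;     /* current count of free pages */
static int lotsfree;    /* above this, don't page at all */
static int desfree;     /* desired free pages; scan harder below */

/* pages the pageout daemon should scan on this wakeup */
static int
pageout_scan_rate(void)
{
    if (freemem >= lotsfree)
        return 0;                       /* plenty free: do nothing */
    if (freemem >= desfree)
        return lotsfree - freemem;      /* a bit short: scan gently */
    return (lotsfree - freemem) * 2;    /* really short: scan hard */
}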

| If you don't keep sufficient free pages to satisfy ongoing faults,
| the system bogs down with processes blocked waiting for pages to be
| freed.  The observations I have definitely confirm this is happening
| during the "freeze" states.  They do *not* show whether that actually
| causes the freezes or if both are symptoms of another bug.
| 
| It would be mildly ironic if the swapper was, itself, running
| the freelist into the ground...

Could be?  Or the pager, maybe, with its asynchronous writes.

In the new VM, is the number of swap buffers still constant?

Could it be that the pager daemon, on finding a huge number of
dirty pages, queues up a bunch of them to be pushed out to disk
and then blocks when the swap buffers are all used; then each
swdone() call, made as a page finishes its write to disk, lets
the evil-allocator-process acquire the just-cleaned page while
the pager queues up another dirty page for migration to disk?
If this is happening, then perhaps it continues until the swapper
frees up a bunch of memory, allowing the cycle to start over...?
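
In sketch form, with NSWBUF, swdone(), and the rest as stand-ins
for the real swap-buffer machinery (the point here is the shape of
the livelock, not the actual kernel code):

#define NSWBUF 64   /* fixed pool of swap I/O buffers */

static int swbuf_in_use;
static int free_pages;

static void queue_pageout(void) { }  /* start an async write (stub) */
static void sleep_on_swbuf(void) { } /* block until a buffer frees (stub) */

/* called at I/O completion for each page pushed to swap */
static void
swdone_sketch(void)
{
    swbuf_in_use--;
    free_pages++;   /* the page is clean and free...  */
    free_pages--;   /* ...and the allocator grabs it right back */
}

static void
pager_loop_sketch(void)
{
    for (;;) {
        if (swbuf_in_use == NSWBUF) {
            sleep_on_swbuf();   /* all buffers busy: pager stalls */
            continue;
        }
        swbuf_in_use++;
        queue_pageout();        /* one more dirty page in flight */
        /*
         * free_pages never climbs above ~0 until the swapper
         * evicts whole processes and breaks the cycle.
         */
    }
}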

This has me wondering if the sequential madvise stuff has
ever been implemented, actually...
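
For reference, the interface I mean is madvise(2); whether the VM
actually does anything with the hint is exactly the question.  (The
call below is the interface as documented; "bigfile" is just a
placeholder.)

#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int
main(void)
{
    struct stat st;
    char *p;
    int fd = open("bigfile", O_RDONLY);

    if (fd == -1 || fstat(fd, &st) == -1)
        return 1;

    p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED)
        return 1;

    /* hint: we'll read sequentially, so the VM can read ahead and
     * recycle pages behind us instead of letting them pile up */
    if (madvise(p, (size_t)st.st_size, MADV_SEQUENTIAL) == -1)
        perror("madvise");

    /* ... sequential pass over p[0 .. st.st_size) ... */

    munmap(p, (size_t)st.st_size);
    close(fd);
    return 0;
}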

	Sean.