current-users: paging questions

Subject: paging questions
To: None <current-users@NetBSD.ORG>
From: Mika Nystrom <mika@cs.caltech.edu>
List: current-users
Date: 03/08/1996 12:44:48
Hi people, 
   I have some questions about the virtual memory system that's in 
netbsd-current. I have looked at the sources and read the Mach paper
from 1987 about it, but I am still confused (possibly more so than before
I looked at the documentation!). Anyhow, here is the problem and some
related questions:
 
  I am running systems with a P120 and 32 or 64 MB of RAM, with
the OS partly on a local disk, partly over NFS, and swap on the local
disk (IDE drive).
  The systems are quite snappy UNTIL they run out of memory, and then they
just die. Here's a sample vmstat 1 run taken while doing calloc(1,10000*1024)
with only a few hundred KB free initially:

(8)stun4q:~>vmstat 1
 procs   memory     page                    disks         faults      cpu
 r b w   avm   fre  flt  re  pi  po  fr  sr ?0 ?1 ?2 ?3   in   sy  cs us sy id
 1 0 0210076 16500    6   1   1   1   0   2  0  0  0  0    0   22   5  0  0 99
 0 0 0210076 16496    6   1   2   1   0   1  0  0  0  0    1 1811 400  7 11 82
 0 0 0197528 16496    2   1   1   1   0   1  0  0  0  0    1  279  63  2  1 97
 1 0 0213916 13456  763   1   1   1   0   1  0  0  0  0    1  175  43  9 10 81
 0 0 0213916   480 3250 162   1   1   0 163  0  0  0  0    1 1392 366 31 59 10
 0 0 0213916   480    2   1   1   1   0   1  0  0  0  0    1 1423 320  5  6 89
 0 0 0213916   480    2   1   1   1   0   1  0  0  0  0    1  154  33  1  1 98
<calloc done here>
 010 0230304     0  553 262   1  99   0 789  0  0  0  0    1  149  38  8  8 84
 1 8 0234752   168  110  55   2  67   0 271  0  0  0  0    1   69  21  1  4 95
 1 9 0234752    56  224  77   2  69   0 336  0  0  0  0    1   46  17  3  5 92
 0 9 0234752     0  134  50   2  67   0 227  0  0  0  0    1   66  23  2  3 95
 1 8 0234752   284   49 292   2  71   0 482  0  0  0  0    1   45  16  1  5 94
 0 8 0230268     0  115  65   1  67   0 171  0  0  0  0    1   47  15  1  4 95
 0 5 0230268     0   90  12  89  67   0 166  0  0  0  0    1  168  37  1  3 96
 0 6 0213136     0   15  22  11  67   0  98  0  0  0  0    1  369  85  2  3 95
 0 5 0213136     0    2   1   1 136   0 136  0  0  0  0    1   60  19  1  7 92
 0 5 0213136     0    2   1   1 118   0 118  0  0  0  0    1   50  18  1  7 92
 0 5 0213136     0    3   1   1  80   0  80  0  0  0  0    1   45  16  0  2 98
 0 5 0213136     0    2   1   1 121   0 121  0  0  0  0    1   45  19  1  5 94
 0 6 3222020     0    3   1   1  71   0  71  0  0  0  0    1   45  21  0  3 97
 0 7 3222020     0    3   1   1  93   0  93  0  0  0  0    1   62  25  0  4 96
 0 7 3222020     0    2   1   1  82   0  82  0  0  0  0    1   45  19  0  3 97
 procs   memory     page                    disks         faults      cpu
 r b w   avm   fre  flt  re  pi  po  fr  sr ?0 ?1 ?2 ?3   in   sy  cs us sy id
 0 8 3222020     0    4   2   1 119   0 119  0  0  0  0    1   48  21  0  5 95
 0 8 3204320     0    2   1   1 134   0 134  0  0  0  0    1   68  23  0  9 91
 0 9 3179008     0    3   1   1 134   0 134  0  0  0  0    1  114  33  2  6 92
 010 3179008     0    3   1   1 134   0 134  0  0  0  0    1   24  16  0  6 94
 010 3179008     0    2   1   1  99   0  99  0  0  0  0    1   21  14  0  4 96
 012 3174372     0  142 1449 141 159   0 1747  0  0  0  0    1   21  25  0 13 87

etc. I did the calloc() when there was 480 KB free. The system freezes up
nicely within a few seconds and doesn't come back until, oh, about another
screen of vmstat output later (and then it hangs again after a few seconds).
In the interim, the drive light flashes about once a second, but it doesn't
sound like it's doing anything terribly useful (it goes "whirr" for a bit
rather than the expected constant "whirrr-whirrr-whir-whir-whirr").
I fiddled with cnt.v_free_min and cnt.v_free_target (using
gdb -k -w netbsd.gdb /dev/mem), but that doesn't seem to have helped. I was
hoping to get it to page out in larger chunks, but no luck.
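
For anyone who wants to reproduce this, the test amounts to roughly the
following. This is only a minimal sketch, not the exact program used: the
function name alloc_and_touch and the page_size parameter are made up for
illustration, and the explicit write to each page is an added assumption to
make sure every page really has to be backed by RAM or swap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Allocate `kbytes` KB of zeroed memory with calloc() and write one byte
 * into every page, so the VM system has to provide real backing for all
 * of it. Returns the number of pages touched, or 0 if the allocation
 * failed or page_size is 0. (Hypothetical helper, for illustration only.)
 */
size_t alloc_and_touch(size_t kbytes, size_t page_size)
{
    size_t nbytes = kbytes * 1024;
    size_t touched = 0;
    size_t off;
    volatile unsigned char *buf;

    if (page_size == 0)
        return 0;

    buf = calloc(1, nbytes);
    if (buf == NULL)
        return 0;

    /* One write per page is enough to force the page to be resident. */
    for (off = 0; off < nbytes; off += page_size) {
        buf[off] = 1;
        touched++;
    }

    /* Sanity check: we touched ceil(nbytes / page_size) pages. */
    assert(touched == (nbytes + page_size - 1) / page_size);

    free((void *)buf);
    return touched;
}
```

Calling alloc_and_touch(10000, 4096) from main() corresponds to the
calloc(1,10000*1024) above, assuming 4 KB pages; running "vmstat 1" in
another window while it executes should show the same kind of output.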

Under "normal" operation, vmstat looks like this:

(10)stun4q:~>vmstat 1
 procs   memory     page                    disks         faults      cpu
 r b w   avm   fre  flt  re  pi  po  fr  sr ?0 ?1 ?2 ?3   in   sy  cs us sy id
 1 0 0129416  1940    6   1   1   1   0   2  0  0  0  0    0   22   5  0  0 99
 0 0 0129416  1932    6   1   3   1   0   1  0  0  0  0    1   50  11  0  1 99
 0 0 0124816  1932    2   1   1   1   0   1  0  0  0  0    1   35   8  0  0 100
 0 0 0124816  1932    2   1   1   1   0   1  0  0  0  0    1   29   6  0  0 100

i.e., sr is 1 just about always. Also, the operating system hardly ever swaps
out processes (even those that have been idle for >12 hours). I don't exactly
know what to expect, since I am most used to SunOS 4.1, which has completely
different VM management. Btw, my malloc test behaves much better
under SunOS 4.1.4, FreeBSD 2.1.0-RELEASE and even Linux 1.3.71. (The drives
become much more active immediately.) It seems as if my kernel is "almost"
deadlocking somehow. Ho hum.

Also, is there a way to swap to files? E.g., swapping to a file over NFS
without using bootp (does that even work?), or for that matter adding a
swap file in an FFS filesystem while the system is up. And iostat: is that
broken just like netstat, or am I not giving it the right flags?


   Regards, 
     Mika
     <mika@vlsi.cs.caltech.edu>