Subject: Re: current got 'swappier'/slower.
To: None <current-users@netbsd.org>
From: Kurt Schreiner <ks@ub.uni-mainz.de>
List: current-users
Date: 01/06/2004 12:49:10
Hi,
just another observation I've been making over the past few days:
System is a p3/900 notebook w/ 1G RAM, 2 disks (40G + 80G).
Freshly booted 'top' shows something like:
load averages: 0.58, 0.49, 0.35
62 processes: 61 sleeping, 1 on processor
CPU states: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
Memory: 75M Act, 3912K Wired, 13M Exec, 9408K File, 863M Free
Swap: 1292M Total, 1292M Free
PID USERNAME PRI NICE SIZE RES STATE TIME WCPU CPU COMMAND
231 root 2 0 9032K 8464K select 0:04 0.00% 0.00% XFree86
606 ks 28 0 192K 940K CPU 0:00 0.00% 0.00% top
415 root 30 0 10M 10M segment 0:00 0.00% 0.00% lfs_cleanerd
17 root 18 0 0K 39M syncer 0:00 0.00% 0.00% [ioflush]
176 ntpd 18 0 1028K 2964K pause 0:00 0.00% 0.00% ntpd
.......
Note [ioflush] starting out w/ 39M RES.
Then, 4 minutes later, after a 'cvs update' in /usr/src, you see:
load averages: 1.83, 1.07, 0.62
65 processes: 64 sleeping, 1 on processor
CPU states: 5.0% user, 0.0% nice, 6.5% system, 0.0% interrupt, 88.6% idle
Memory: 154M Act, 3992K Wired, 13M Exec, 88M File, 684M Free
Swap: 1292M Total, 1292M Free
PID USERNAME PRI NICE SIZE RES STATE TIME WCPU CPU COMMAND
161 ks 2 0 384K 1904K select 0:01 2.10% 2.10% sshd
231 root 2 0 9032K 8464K select 0:04 0.00% 0.00% XFree86
606 ks 28 0 240K 988K CPU 0:00 0.00% 0.00% top
415 root 30 0 10M 10M segment 0:00 0.00% 0.00% lfs_cleanerd
17 root 18 0 0K 133M syncer 0:00 0.00% 0.00% [ioflush]
176 ntpd 18 0 1028K 2964K pause 0:00 0.00% 0.00% ntpd
........
Note [ioflush] now using 133M!
I don't know precisely where 'top' gets these numbers from, but
after some 'rsync <disk0:dir> <disk1:dir>' runs I can see the kernel
tasks eating up nearly all memory (around 870MB last night). In rare
cases (I don't know what triggers this) a few percent of memory is
freed again. Doing "enough" rsyncs I can get the system to crash
(sorry, no crash dump; I was on an X display and couldn't switch back
to wscons).
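
As far as I can tell, top(1) builds its RES column from the
kinfo_proc2 record the kernel exports via sysctl - p_vm_rssize,
counted in pages. A minimal sketch that should print the same figure
for a given PID, assuming I have the mib layout right:

    #include <sys/param.h>
    #include <sys/sysctl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    /*
     * Print the resident set size of a process, roughly the way
     * top(1) computes its RES column: p_vm_rssize is in pages,
     * so scale by the page size.
     */
    int
    main(int argc, char **argv)
    {
            struct kinfo_proc2 kp;
            size_t len = sizeof(kp);
            int mib[6];

            mib[0] = CTL_KERN;
            mib[1] = KERN_PROC2;
            mib[2] = KERN_PROC_PID;
            mib[3] = argc > 1 ? atoi(argv[1]) : getpid();
            mib[4] = sizeof(kp);    /* size of one record */
            mib[5] = 1;             /* number of records wanted */

            if (sysctl(mib, 6, &kp, &len, NULL, 0) == -1) {
                    perror("sysctl");
                    return 1;
            }
            printf("RES: %ldK\n",
                (long)kp.p_vm_rssize * getpagesize() / 1024);
            return 0;
    }

Running that against pid 17 should show the same growing figure top
reports for [ioflush].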
The behaviour seems to depend on the setting of kern.maxvnodes:
with it set to 2M(!) I experienced the crashes. Last night I played
around with it set to 1M and couldn't crash the machine. I've tested
kernels both w/ and w/o option NEW_BUFQ_STRATEGY, but that doesn't
seem to make a difference...
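
In case anyone wants to play along: kern.maxvnodes is just a sysctl,
so 'sysctl -w kern.maxvnodes=N' does it. For reference, a rough C
equivalent using sysctlbyname(3) - setting the value needs root:

    #include <sys/param.h>
    #include <sys/sysctl.h>
    #include <stdio.h>
    #include <stdlib.h>

    /*
     * Read kern.maxvnodes and, given an argument, set a new value,
     * like "sysctl kern.maxvnodes" / "sysctl -w kern.maxvnodes=N".
     */
    int
    main(int argc, char **argv)
    {
            int cur, new;
            size_t len = sizeof(cur);

            if (sysctlbyname("kern.maxvnodes", &cur, &len,
                NULL, 0) == -1) {
                    perror("sysctlbyname");
                    return 1;
            }
            printf("kern.maxvnodes = %d\n", cur);

            if (argc > 1) {
                    new = atoi(argv[1]);
                    if (sysctlbyname("kern.maxvnodes", NULL, NULL,
                        &new, sizeof(new)) == -1) {
                            perror("sysctlbyname(set)");
                            return 1;
                    }
                    printf("kern.maxvnodes -> %d\n", new);
            }
            return 0;
    }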
As the usage of kernel memory goes up, process space is swapped to
disk - and the system becomes really sluggish, as others have
observed. I think a lot of memory is being used up somewhere but
never freed?
BTW: running 'top' in another window you can watch the mem-figures
for the kernel processes going up really fast...
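
If I read the top sources right, those figures in the "Memory:" line
come from the VM_UVMEXP2 sysctl; running something like the following
in a loop shows the same counters moving. A sketch, assuming struct
uvmexp_sysctl as exported through <uvm/uvm_extern.h>:

    #include <sys/param.h>
    #include <sys/sysctl.h>
    #include <uvm/uvm_extern.h>
    #include <stdio.h>

    /*
     * Dump the UVM counters that top's "Memory:" line is built
     * from; all counts are in pages, so scale by the page size.
     */
    int
    main(void)
    {
            struct uvmexp_sysctl ue;
            size_t len = sizeof(ue);
            int mib[2] = { CTL_VM, VM_UVMEXP2 };
            long kb;

            if (sysctl(mib, 2, &ue, &len, NULL, 0) == -1) {
                    perror("sysctl");
                    return 1;
            }
            kb = (long)(ue.pagesize / 1024);
            printf("Act %lldK Wired %lldK File %lldK Free %lldK\n",
                (long long)ue.active * kb,
                (long long)ue.wired * kb,
                (long long)ue.filepages * kb,
                (long long)ue.free * kb);
            return 0;
    }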
Hope this observation is of some use in debugging this problem. If
I can help with more testing or such, just say so...
-Kurt