Subject: Re: horrible raidframe performance on 2.0-RC5
To: Egervary Gergely <egervary@expertlan.hu>
From: Frederick Bruckman <fredb@immanent.net>
List: current-users
Date: 12/05/2004 10:08:52
In article <20041204000926.CC25955C00@cs.usask.ca>,
	oster@cs.usask.ca (Greg Oster) writes:
> Egervary Gergely writes:
>  
>> the fs then mounts okay, but copying files onto it takes forever,
>> and while it's copying, the box locks up, sometimes even pinging the
>> machine, or get into the CRTL-ALT-ESC debugger is impossible.
> 
> How much RAM in these boxes?  This is sounding like kernel RAM 
> contention.  I'm not sure why copying files would take forever 
> though... (that would sound more like a disk timeout..)

For what it's worth, I'm not having a problem with RAIDFRAME on my
dual-processor Compaq W8000 (1GB RAM, two 18GB Cheetahs, one 36GB Atlas),
which has been tracking the RCs and was just upgraded to NetBSD 2.0.
I *have* seen slowdowns under heavy I/O, but not just to the RAID set.
The X cursor doesn't move. If I manage to switch to a virtual console,
the shell is fine. I think the last-minute change to RC5 made it a
little better. In any case, it's more of a quirk than a real issue.

On the other hand, I have a heavily tasked K6-2 (with only 256MB RAM)
mailserver/fileserver/printserver that had become unusable after only
a few days of uptime with some earlier RC. You could log in and start
a shell, but many simple programs would just hang. (Presumably, the
shells already have all their pages in memory?) It would take repeated
tries just to invoke the rc scripts to shut all the services down, and
then still a few would have to be stopped with "kill -9". Exiting the
single-user shell back to multi-user mode would yield a few more days
of uptime (without actually rebooting).

It's been up for 34 days now, running 2.0_RC4, and all I had to do was
raise NKMEMPAGES to 32768 (that's twice the default of 1/4 total RAM).
"top", right now, shows that the kernel threads are using 74M, which is
more than would have been allowed under the default. I have another box,
a 486 with only 48M RAM, which is now only a dial-up gateway, but also
needs NKMEMPAGES bumped to 1/2 of total RAM just to get to "init".
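
For anyone who wants to check the arithmetic, here is a rough sketch in
Python. It assumes 4 KB pages (as on i386) and takes the default to be
1/4 of total RAM, as stated above; the function names are just
illustrative, not anything from the kernel sources. The result is the
value that would go into the kernel config as
"options NKMEMPAGES=<pages>".

```python
# Sketch only: assumes 4 KB pages (i386) and RAM sizes given in MB.
PAGE_SIZE = 4096


def nkmempages(ram_mb, divisor, page_size=PAGE_SIZE):
    """Number of pages covering 1/divisor of total RAM."""
    ram_bytes = ram_mb * 1024 * 1024
    return ram_bytes // divisor // page_size


def pages_to_mb(pages, page_size=PAGE_SIZE):
    """Convert a page count back to megabytes."""
    return pages * page_size // (1024 * 1024)


if __name__ == "__main__":
    # 256MB K6-2: default (1/4 of RAM) versus the raised value (1/2 of RAM)
    print(nkmempages(256, 4), pages_to_mb(nkmempages(256, 4)))
    print(nkmempages(256, 2), pages_to_mb(nkmempages(256, 2)))
    # 48MB 486 with NKMEMPAGES at 1/2 of RAM
    print(nkmempages(48, 2), pages_to_mb(nkmempages(48, 2)))
```

On those assumptions the 256MB box gets 16384 pages (64MB) by default
and 32768 pages (128MB) at 1/2 of RAM, which is why 74M of kernel usage
would not have fit under the default. The 48MB box works out to 6144
pages (24MB).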

I wonder that more folks don't see such problems. Perhaps others are
quicker to upgrade chronically underpowered computers. In any case, "top"
or "vmstat -m" will tell you if your utilization of kernel pages is
close to the limit.


Frederick