Subject: Re: SS10 unstable under high load (SMP)
To: None <port-sparc@NetBSD.org>
From: Atte Peltomaki <atte.peltomaki@iki.fi>
List: port-sparc
Date: 07/12/2007 09:48:37
On Tue, Jul 10, 2007 at 09:42:41AM +0300, Atte Peltomäki wrote:
> I have two SPARCstations; SS5 and SS10. Both are running NetBSD 3 with
> jdc@'s raidframe patches(1*). The SS5 is UP, and stable. The SS10
> however, is very unstable under high load, subject to hang and/or panic
[...] 
> However, after compiling a UP kernel things seem to work just fine. I'm
> typing this on the problematic box with UP kernel, network constantly
> blowing ~220kB/sec down and CPU constantly between 60 and 100 percent
> use. Apart from a little sluggishness involved with using mutt and vim
> with syntax highlighting, the box seems to work fine. 

Updated information: 

Since writing the above, I have managed to reproduce the bug on both the
SS5 and the SS10 using a UP kernel. All it seems to take is to fire up
enough Python BitTorrent clients, which put plenty of load on both the
CPU and the Lance Ethernet chip.

I'm now compiling a kernel with 'options LOCKDEBUG'. I believe this bug is
related to the more serious stability problems people are experiencing
with L2-cacheless MP setups, and to the one newlock2-related 4.0_BETA PR.

By the way, forcing a crash dump on the SS5 simply prints "Bad dump
device", while on the SS10 it crashes (see the screenshots in my previous
post). I'm positive the dump device is configured correctly.

I will post more information as it becomes available.

-- 
Atte Peltomäki
http://kameli.org