Subject: re: SS10 unstable under high load (SMP)
To: Atte Peltomaki <atte.peltomaki@iki.fi>
From: matthew green <mrg@eterna.com.au>
List: port-sparc
Date: 07/13/2007 09:00:03
   On Tue, Jul 10, 2007 at 09:42:41AM +0300, Atte Peltomäki wrote:
   > I have two SPARCstations; SS5 and SS10. Both are running NetBSD 3 with
   > jdc@'s raidframe patches(1*). The SS5 is UP, and stable. The SS10
   > however, is very unstable under high load, subject to hang and/or panic
   [...] 
   > However, after compiling a UP kernel things seem to work just fine. I'm
   > typing this on the problematic box with UP kernel, network constantly
   > blowing ~220kB/sec down and CPU constantly between 60 and 100 percent
   > use. Apart from a little sluggishness involved with using mutt and vim
   > with syntax highlighting, the box seems to work fine. 
   
   Updated information: 
   
   Since writing this, I have managed to reproduce the bug on both the SS5
   and the SS10 using a UP kernel. All it seems to take is to fire up enough
   python bittorrent clients, which put plenty of load on both the CPU and
   the Lance.
   
   I'm now compiling a kernel with 'options LOCKDEBUG', and I believe this is
   related to the more serious stability problems people are experiencing
   with L2-cacheless MP-setups, and the one newlock2 related 4.0_BETA PR. 
   
   Btw. forcing a crashdump on the SS5 simply says "Bad dump device" while
   on the SS10 it crashes (see screenshots from prev. post). I'm positive
   the dump device is configured correctly. 


interesting;  can you try compiling python with GNU pth instead of
native pthreads?  they are a known problem on 4.0/sparc...


.mrg.