Port-i386 archive


ongoing major problems with NetBSD-5 and LOCKDEBUG on multi-core system



So I was finally able to get a new server, and it's a nice big Dell
PE2950 with 32GB RAM, lots-o-disk on a PERC-6/i, and a pair of zippy
Intel Xeon E5440 CPUs (quad-cores x2).

It's the first real hardware I've tried to do anything serious with
netbsd-5 on -- previously I'd only run netbsd-5 in VirtualBox (though
with two CPUs, on my iMac).

Everything looked OK during initial install of NetBSD-5, but during the
first real load test (build.sh -j 4) it crashed:  PR# 45827.

I've since seen this a bunch more times.  The machine is basically
useless because it seems to crash almost immediately under any decent
load.

Most recently I've been trying to use it to run sysinst to install to a
CF card that's connected via a USB reader.  The first attempt ended in
the middle of unpacking the man.tgz set with a similar crash to PR#
45827, and the second attempt ended in the middle of comp.tgz with:


panic: WARNING: SPL NOT LOWERED ON SYSCALL EXIT
LOCKDEBUGWARNING: SPL NOT LOWERED ON SYSCALL EXIT

WARNING: SPL NOT LOWERED ON TRAP EXIT
fatal breakpoint trapWARNING: SPL NOT LOWERED ON TRAP EXIT
 in supervisor mode
WARNING: SPL NOT LOWERED ON SYSCALL EXIT
trap type 1 code 0 eip c05cc4ac cs 8 eflags 246 cr2 bbb90000 ilevel 0
Stopped in pid 2751.1 (systat) at       netbsd:breakpoint+0x4:  popl    %ebp
db{4}> trace
breakpoint(c0bfe3da,dcc4bac8,c3398800,c04fcb9f,0,1,0,0,dcc4bac8,2) at netbsd:breakpoint+0x4
panic(c0b907a0,c0b8d6c7,c093658b,c0b90800,c04e06b2,1c62a60,0,8,1,c0d52cc0) at netbsd:panic+0x1b0
lockdebug_abort1(c0b90800,1,1,c0d5c120,0,0,c0d5c120,c3ba8db8,68,7fffffff) at netbsd:lockdebug_abort1+0xbb
rw_vector_exit(c0d52cc0,68,dcc4bc0c,c0534005,0,bbb909e0,68,1,c053eb38,dcc62a60) at netbsd:rw_vector_exit+0xc8
sysctl_unlock(0,bbb909e0,68,1,c053eb38,dcc62a60,0,18,1,dc81d7d0) at netbsd:sysctl_unlock+0x12
sysctl_dobuf(dcc4bca4,4,bbb55000,dcc4bccc,0,0,dcc4bc9c,dcc62a60,c336c980,0) at netbsd:sysctl_dobuf+0xc5
sysctl_dispatch(dcc4bc9c,6,bbb55000,dcc4bccc,0,0,dcc4bc9c,dcc62a60,c336c980,0) at netbsd:sysctl_dispatch+0xcf
sys___sysctl(dcc62a60,dcc4bd00,dcc4bd28,dcc4bd40,c05b8f02,c0d7dc20,ca,bfbfdf2c,6,bbb55000) at netbsd:sys___sysctl+0xd6
syscall(dcc4bd48,b3,ab,1f,1f,bfbfdf2c,bbb55000,bfbfde88,0,6) at netbsd:syscall+0x100
db{4}> 



I've also got what seems to be a 100% repeatable panic (cpu_switchto:
switching above IPL_SCHED) happening if I try to power the system down
with "halt -p".  I don't know if that is in any way related or not.


So, I've been wondering, is anyone else running netbsd-5 on a many-core
system with a LOCKDEBUG + DIAGNOSTIC + DEBUG kernel?


-- 
                                                Greg A. Woods

+1 250 762-7675                                RoboHack <woods%robohack.ca@localhost>
Planix, Inc. <woods%planix.com@localhost>      Secrets of the Weird <woods%weird.com@localhost>



