Subject: SMP stability issues
To: None <tech-smp@netbsd.org>
From: Chris Rendle-Short <jim@tty1.rr.nu>
List: tech-smp
Date: 11/10/2006 19:27:22
Hi,

For the last couple of months I've been running NetBSD 3.0.1 (and, since yesterday, 3.1) on an Abit VP6 SMP motherboard with two P3 866s. The system is mainly used as a mail, web, and Samba server, along with occasional other odd tasks.

When I run the GENERIC kernel, the machine is rock solid. However, when I use either GENERIC.MP or my own kernel (which is basically GENERIC.MP with PCMCIA and sound support removed), it invariably locks up after running for a while. It is a hard lockup; nothing will revive it other than hitting the reset switch.

The uptime before the lockup has so far varied between about 1 hour and 6 days. There doesn't seem to be any pattern to it, other than the fact that it only happens when running an SMP kernel. I can't find anything in the logs to give any clues.

I'm pretty sure it's not a hardware fault, as I've tested everything I can think of. Added to that, prior to running NetBSD the box ran Linux (in SMP mode) without any problems (uptime was 193 days when I took it down to install NetBSD). The root filesystem is on RAIDframe, in case that makes any difference.

Does anyone have any ideas about what could be causing this, or any troubleshooting suggestions? Needless to say, it's a very irritating problem.

Thanks in advance,
Chris.