tech-kern archive


Re: System goes into complete hang (Was: Is amd64 ready for the desktop?)




On 2-Apr-08, at 11:06 AM, D'Arcy J.M. Cain wrote:
> On Wed, 2 Apr 2008 14:59:57 +0000
> Herb Peyerl <hpeyerl%beer.org@localhost> wrote:
>> While I've totally not been paying attention to your problem or anything, but if you think the interrupt system is turned off (as opposed to something at a really high spl spinning), you might try twiddling something like DTR on a serial port... Com ports are usually pretty high priority so you might get lucky and end up with some traction on the problem...
>
> Isn't the keyboard pretty high too?  I can't even get the NumLock or
> CapsLock to light up.

The keyboard interrupt never seems high enough up.  :-)

(On BSD/OS I seem to remember being able to get Ctrl-Alt-DEL to work even when everything else seemed to be hung solid, but I can't remember if it would only work when caps-lock would work or not.)

Anyway I think the first thing you should do is get a serial console working on that system. :-)

Then at least a BREAK condition should get you into DDB if there's any chance of it at all, i.e. if it's not a hardware problem or a case where the hardware has effectively been disabled completely. A kernel built with DEBUG, DIAGNOSTIC, and LOCKDEBUG on top of that might then help someone find the problem.
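
For reference, a minimal sketch of the kernel config additions this implies, assuming an i386/amd64 config file derived from GENERIC. The com port address and speed on the last line are assumptions to adjust for the actual hardware, and the boot loader would still need to be pointed at the serial console separately:

    options 	DDB		# in-kernel debugger
    options 	DEBUG		# extra debugging checks
    options 	DIAGNOSTIC	# internal consistency checks
    options 	LOCKDEBUG	# lock debugging checks
    # assumed: first com port at 0x3f8, 9600 baud
    options 	CONSDEVNAME="\"com\"",CONADDR=0x3f8,CONSPEED=9600

These checks do slow the kernel down somewhat, so this is a debugging configuration rather than something to leave in place permanently.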

I'm wondering if you're now maybe seeing the same hang I've been seeing on my Dell server though. It hangs or crashes almost every night during the nightly cron jobs (though not 100% of the time), even though it is otherwise completely idle at the time. Yet it can run simultaneous build.sh runs (with -j6), big cvs updates, and big pkgsrc builds (e.g. mozilla) all at once without even the slightest hiccup. I haven't had the courage to boot a -current kernel on my Dell though. There were suggestions that the SA (scheduler activations) stuff in netbsd-4 might be causing problems, so I've been trying the wrstuden-fixsa branch of netbsd-4, but all that's done is change the symptoms of the hang a wee bit (now processes get stuck in vmmapva and wedge the whole system).

I guess I should try a -current kernel, but I really, really, really hate to do that on my main production file server and build host. I need netbsd-4 itself to be stable and reliable, so randomly trying -current isn't necessarily much help in finding the problem with netbsd-4, at least not for me.

--
                                        Greg A. Woods; Planix, Inc.
                                        <woods%planix.ca@localhost>




