Subject: Re: -current kernel hangs on amd64
To: Andrew Doran <ad@netbsd.org>
From: Kurt Schreiner <ks@ub.uni-mainz.de>
List: current-users
Date: 10/11/2007 09:28:13
On Wed, Oct 10, 2007 at 10:34:45PM +0200, Andrew Doran wrote:
> On Wed, Oct 10, 2007 at 10:26:49PM +0200, Kurt Schreiner wrote:
> 
> > Not yet, I'll do that tomorrow (at work) and post the "results". The hang can
> > be reproduced very easy by just doing "build.sh -j 8 ..." (tried this afternoon
> > -after userland was build.sh'ed -j8 running a kernel with the old scheduler w/o
> > problems- output of bt was the same, but I've had no time to investigate further).
> 
> If it happens again, can you switch onto the other CPU (eg 'mach cpu 1') and
> get a backtrace there? It looks like the two CPUs are deadlocked.
I built a kernel w/ options LOCKDEBUG & DIAGNOSTIC as rmind@ suggested (had to
take out lpt to get it running) BUT:
The machine hangs really hard now:

[...]
Thu Oct 11 09:15:43 MEST 2007

NetBSD/amd64 (isunopti) (console)

login: p

[nothing after the "p"]

The "p" seems to be from trying to print "panic..." but that's it: no
db prompt and no reaction to "break" via serial console -> walk to
server room, press power button... ;-)
The same happened last night on a P4 "dual"-processor (HT enabled): only
the "p" from "panic..." was displayed, then the power button was needed to get
any reaction from the system...

Would it make more sense to update my source to current -current before
trying again?

Kurt