Subject: Re: MP?
To: None <bqt@update.uu.se>
From: Havard Eidnes <he@netbsd.org>
List: port-alpha
Date: 01/22/2004 17:00:28
> >  o the kernel panic you get is in ltsleep(), and seems to indicate
> >    that a sleep is done outside of a process context, i.e. curlwp is
> >    NULL.  It would be interesting to see a stack backtrace to see
> >    where this happens.  I'm not sure if this is actually related to
> >    the machine running with multiple physical CPUs (but failed to
> >    initialize the secondary CPUs).
>
> panic: spinlock_switchcheck: CPU 1 has 1 spin locks
> Stopped in pid 5.1 (ioflush) at netbsd:cpu_Debugger+0x4:        ret     zero,(ra)

Hm, that's a different panic than the one you reported earlier,
which was:

panic: kernel diagnostic assertion "p != NULL" failed: file "/usr/src/sys/kern/kern_synch.c", line 413
Stopped at      netbsd:cpu_Debugger+0x4:        ret     zero,(ra)
db{1}>

> db{1}> bt
> cpu_Debugger() at netbsd:cpu_Debugger+0x4
> panic() at netbsd:panic+0x1f8
> spinlock_switchcheck() at netbsd:spinlock_switchcheck+0xa4
> prologue botch: displacement 16
> frame size botch: adjust register offsets?
> mi_switch() at netbsd:mi_switch+0x58
> mi_switch() at netbsd:mi_switch+0x58
> db{1}>
>
> Not really pretty, I'd say.

I agree.  Not sure how useful that is.

I wonder, does it somehow think that the slave CPUs have started?  If
so, there may be something wrong with the error handling in the case
where they don't spin up.  ...and, indeed, cpu_boot_secondary() does
not have a return value, so if something goes wrong there, the rest of
the kernel is never told, and only the user is informed via the
console output.
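
To illustrate the point about the missing return value, here is a
minimal sketch in C.  It is not the actual NetBSD/alpha code: every
name in it (boot_secondary, boot_all_secondaries, cpus_running_mask,
spinup_fn, MAX_CPUS) is hypothetical, and the spin-up step is replaced
by a caller-supplied stub.  The idea is only that the boot routine
reports success or failure, and that the caller records which CPUs
really came up instead of assuming all of them did.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_CPUS 8    /* hypothetical limit for this sketch */

/* Hypothetical stand-in for the platform-specific spin-up attempt. */
typedef bool (*spinup_fn)(int cpu_id);

/*
 * Bitmask of CPUs that are actually running.  Other code would consult
 * this rather than assume every attached CPU has started.
 */
uint64_t cpus_running_mask;

/* Try to start one secondary CPU; return 0 on success, -1 on failure. */
int
boot_secondary(int cpu_id, spinup_fn spinup)
{
    if (!spinup(cpu_id)) {
        /* Still tell the user, but also tell the caller. */
        printf("cpu%d: failed to spin up\n", cpu_id);
        return -1;
    }
    cpus_running_mask |= (uint64_t)1 << cpu_id;
    return 0;
}

/* Start CPUs 1..ncpus-1; return how many secondaries came up. */
int
boot_all_secondaries(int ncpus, spinup_fn spinup)
{
    int started = 0;

    assert(ncpus >= 1 && ncpus <= MAX_CPUS);
    cpus_running_mask = 1;    /* CPU 0, the primary, is already running */

    for (int id = 1; id < ncpus; id++) {
        if (boot_secondary(id, spinup) == 0)
            started++;
    }
    return started;
}

/* Example stub: pretend only even-numbered CPUs respond. */
bool
spinup_even_only(int cpu_id)
{
    return (cpu_id % 2) == 0;
}
```

Calling boot_all_secondaries(4, spinup_even_only) would report one
secondary started and leave only bits 0 and 2 set in the mask, so
later code has a way to know that CPUs 1 and 3 never came up.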

The root problem, I suspect, is that your secondary CPUs don't spin
up.  Could you try with just two identical CPUs in the chassis and see
what happens?

Regards,

- Håvard