Subject: Re: couldn't ping cpus
To: matthew green <mrg@eterna.com.au>
From: Manuel Bouyer <bouyer@antioche.eu.org>
List: port-sparc
Date: 01/19/2003 19:32:36
On Sun, Jan 19, 2003 at 06:49:17PM +0100, Manuel Bouyer wrote:
> Well, it died as well, and again in the make obj. This can be because of the
> filesystem activity, or the high exec/fork/exit load.
> The box doesn't swap, the frozen top window says there is 22M RAM free.
> 
> Here is what I have on the console:
> xcall(cpu0, f00078dc): couldn't ping cpus: cpu1
> simple_lock: locking against myself (this is probably the cause of the lockup
> without lockdebug)
> lock: 0xf01bbb88, currently at: sys_generic.c:996 on cpu0
> last locked: kern_sync.c:659
> last unlocked: kern_sync.c:637
> logwakeup+0x28
> printf+0xa4
> xcall+0x328
> sched_wakeup+0x1e0
> wakeup+0x84
> lwp_exit2+0x64
> exit2+0x64
> switchexit+0x3c
> cpu_exit+0x120
> exit1+0x128
> sys_exit+0x2c
> 
> the cpu1 stack trace:
> nmi_soft at nmi_sun4m+0x15c
> _simple_lock+0x150
> ltsleep+1b0
> kqueue_scan+0x264
> sys_kevent+0x1b4
> syscall+0x1f4
> 
> Let me know if you want something else, I can reproduce it easily.

I added a Debugger() call right after the "couldn't ping cpus" message.
Stack traces for both CPUs are exactly the same.
From ps, I guess there are a few runnable processes, including the reaper,
and one in state SDEAD.
I haven't been able to determine which ones are on a CPU.

I can't get a core dump.
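
To illustrate the "locking against myself" message quoted above: a
non-recursive spin lock that is taken a second time by the CPU already
holding it can never be released, so a kernel without lock debugging would
simply spin. The following is a minimal single-threaded user-space sketch of
an owner-tracking check, not the NetBSD simple_lock code. All names
(debug_lock, debug_lock_try, debug_lock_release, the LOCK_* results) are
hypothetical and chosen only for this example.

```c
#include <stdio.h>

#define NO_OWNER (-1)

/* Possible outcomes of a lock attempt in this model. */
enum lock_result {
    LOCK_OK,    /* lock was free and is now held by the caller */
    LOCK_BUSY,  /* held by another CPU: a real spin lock would keep spinning */
    LOCK_SELF   /* held by the caller: a plain spin lock would spin forever */
};

/* Hypothetical debugging lock that remembers which CPU holds it. */
struct debug_lock {
    int locked; /* 1 while held, 0 while free */
    int owner;  /* CPU id of the holder, or NO_OWNER */
};

static void debug_lock_init(struct debug_lock *l)
{
    l->locked = 0;
    l->owner = NO_OWNER;
}

/* Try to take the lock on behalf of `cpu` and report what happened. */
static enum lock_result debug_lock_try(struct debug_lock *l, int cpu)
{
    if (l->locked) {
        if (l->owner == cpu) {
            /* The owner check is what turns a silent hang into a message. */
            fprintf(stderr, "debug_lock: cpu%d locking against itself\n", cpu);
            return LOCK_SELF;
        }
        return LOCK_BUSY;
    }
    l->locked = 1;
    l->owner = cpu;
    return LOCK_OK;
}

/* Release the lock; only the recorded owner may do so. Returns 0 on success. */
static int debug_lock_release(struct debug_lock *l, int cpu)
{
    if (!l->locked || l->owner != cpu)
        return -1;
    l->locked = 0;
    l->owner = NO_OWNER;
    return 0;
}
```

In this model, cpu0 taking the lock and then calling debug_lock_try() again
while still holding it gets LOCK_SELF, whereas cpu1 asking for the same lock
gets LOCK_BUSY until cpu0 releases it.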

-- 
Manuel Bouyer <bouyer@antioche.eu.org>
     NetBSD: 23 years of experience will always make the difference