Subject: Re: couldn't ping cpus
To: Paul Kranenburg <pk@cs.few.eur.nl>
From: Manuel Bouyer <bouyer@antioche.eu.org>
List: port-sparc
Date: 01/20/2003 00:26:59
On Sun, Jan 19, 2003 at 11:57:33PM +0100, Paul Kranenburg wrote:
> > Well, it died as well, and again in the make obj. This can be because of the
> > filesystem activity, or the high exec/fork/exit load.
> > The box doesn't swap, the frozen top window says there is 22M RAM free.
> > 
> > Here is what I have on the console:
> > xcall(cpu0, f00078dc): couldn't ping cpus: cpu1
> > simple_lock: locking against myself (this is probably the cause of the lockup
> > without lockdebug)
> > lock: 0xf01bbb88, currently at: sys_generic.c:996 on cpu0
> > last locked: kern_sync.c:659
> > last unlocked: kern_sync.c:637
> > logwakeup+0x28
> > printf+0xa4
>   ^^
> 
> This explains the lockup. It should be a printf_nolog() as it's dangerous
> to use plain printf() with schedlock held.

OK, so there are still plain printf() calls in the xcall code.
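
To make the failure mode concrete, here is a rough user-space sketch of a
logging printf re-entering a lock that the caller already holds. It is not the
NetBSD kernel code: dbg_lock, log_printf(), nolog_printf() and run_demo() are
hypothetical names made up for illustration, and the "lock" only records its
owner so that the re-entry can be reported instead of spinning forever.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define NOCPU (-1)

/* Hypothetical debug lock: remembers which CPU holds it and where. */
struct dbg_lock {
    int locked;
    int owner;          /* CPU id of the holder, or NOCPU */
    const char *where;  /* label of the place that took it */
};

static struct dbg_lock sched_lock = { 0, NOCPU, NULL };
static int self_deadlocks;  /* number of detected re-entries */

/* Returns 0 on success, -1 if this CPU already holds the lock. */
static int dbg_lock_acquire(struct dbg_lock *l, int cpu, const char *where)
{
    if (l->locked && l->owner == cpu) {
        /* A real non-recursive spin lock would spin forever here. */
        fprintf(stderr, "locking against myself: held at %s, "
            "taken again at %s on cpu%d\n", l->where, where, cpu);
        self_deadlocks++;
        return -1;
    }
    l->locked = 1;
    l->owner = cpu;
    l->where = where;
    return 0;
}

static void dbg_lock_release(struct dbg_lock *l)
{
    l->locked = 0;
    l->owner = NOCPU;
}

/* Logging variant: after printing it wakes the log reader,
 * and in this sketch the wakeup needs the same lock. */
static void log_printf(int cpu, const char *msg)
{
    fputs(msg, stderr);
    if (dbg_lock_acquire(&sched_lock, cpu, "log wakeup") == 0)
        dbg_lock_release(&sched_lock);
}

/* Non-logging variant: console only, takes no lock. */
static void nolog_printf(const char *msg)
{
    fputs(msg, stderr);
}

/* Returns how many self-deadlocks were detected. */
static int run_demo(void)
{
    self_deadlocks = 0;
    dbg_lock_acquire(&sched_lock, 0, "xcall path");
    log_printf(0, "couldn't ping cpus\n");  /* re-enters the lock */
    nolog_printf("couldn't ping cpus\n");   /* safe with the lock held */
    dbg_lock_release(&sched_lock);
    assert(!sched_lock.locked);
    return self_deadlocks;
}
```

Calling run_demo() from a main() reports exactly one re-entry, from the logging
variant. The non-logging variant never touches the lock, which is why it is the
safer choice in a path that may already hold it.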

> 
> That leaves us the hunt for the cause of the xcall time-out that started it
> off in the first place.

Can't the second trace give a hint?
It entered _simple_lock(), and then entered nmi_soft() ...
Or is the NMI triggered by the xcall?

-- 
Manuel Bouyer <bouyer@antioche.eu.org>
     NetBSD: 23 years of experience will always make the difference