Subject: Re: SS20/MP Watchdog Reset
To: None <port-sparc@netbsd.org>
From: Eduardo Horvath <eeh@NetBSD.org>
List: port-sparc
Date: 06/15/2004 17:26:33
On Mon, Jun 14, 2004 at 11:54:46PM +0200, Juergen Hannken-Illjes wrote:
> On this machine
> 
> 	total memory = 319 MB
> 	cpu0 at mainbus0: mid 8: TMS390Z50 v0 or TMS390Z55 @ 85 MHz, on-chip FPU
> 	cpu1 at mainbus0: mid 10: TMS390Z50 v0 or TMS390Z55 @ 85 MHz, on-chip FPU
> 
> running -current under heavy load I'm getting
> 
> 	Watchdog Reset
> 	cpu0: NMI: system interrupts: 400c0000<VME=0,SBUS=0,SC,T,ME>
> 	module0:
> 		mxcc error 0x0
> 		mxcc status 0xff1410002
> 		mxcc reset 0x0
> 	module1:
> 		mxcc error 0x0
> 		mxcc status 0xff1100000
> 		mxcc reset 0x4 (WATCHDOG RESET)
> 
> The Watchdog Reset is always on module1. Software or hardware?

Watchdog resets are caused by taking a trap when traps are disabled.

This particular fault is a level 15 interrupt.  I think the only
causes of level 15 interrupts are asynchronous memory errors.
Since traps should only be disabled inside trap handlers, you
are probably suffering from bad RAM.
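
If you want to pick apart the pending bits in that NMI line, something
like the following would do.  This is only a sketch: the mask values are
assumptions inferred from the printed value 400c0000 and the three names
SC, T and ME in your log, and decode_sintr is a made-up helper, so check
the masks against the sun4m interrupt register definitions first.

```python
# Assumed bit masks, inferred from the log line
#   "system interrupts: 400c0000<VME=0,SBUS=0,SC,T,ME>"
# They are not copied from a header file.
SINTR_BITS = {
    0x40000000: "ME",  # module error (asynchronous fault), assumed
    0x00080000: "T",   # timer, assumed
    0x00040000: "SC",  # SCSI, assumed
}


def decode_sintr(value):
    """Return (names of known bits set, leftover bits not in the table)."""
    names = [name for bit, name in sorted(SINTR_BITS.items()) if value & bit]
    unknown = value & ~sum(SINTR_BITS)
    return names, unknown


if __name__ == "__main__":
    names, unknown = decode_sintr(0x400C0000)
    print(names, hex(unknown))
```

With those masks, ME is the only bit in the value that reports an
asynchronous error; SC and T are ordinary device interrupts.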

Eduardo