Subject: Re: "couldn't ping cpus" error when starting X
To: Julian Coleman <jdc@coris.org.uk>
From: Tillman Hodgson <tillman@seekingfire.com>
List: port-sparc
Date: 05/15/2005 15:04:48
(I've cc'ed the list because I'd really like to get assistance from
 anyone who might be able to help. My apologies in advance for the
 breach of netiquette.)

On Thu, May 05, 2005 at 09:41:43AM +0100, Julian Coleman wrote:
> > ares# cpu0: NMI: system interrupts: 40000000<VME=0,SBUS=0,ME>
> > module0:
> >         mxcc error 0x0
> >         mxcc status 0xff1410002
> >         mxcc reset 0x0
> > module1:
> >         mxcc error 0x0
> >         mxcc status 0xff1402000
> >         mxcc reset 0x4 (WATCHDOG RESET)
> > 0tore bType  'ogo' to  resuure me
> >  T0yp01e .  htrelp .. foLevel 15 Interrupt
> > Type  help  for more information
> 
> The only time I've seen errors like this was when the CPU modules in my 20
> died.  One was completely dead and the other wouldn't run MP - when I replaced
> the dead one, I would see errors like this every so often.  After swapping out
> the other CPU module, I didn't see these errors again.  Have you got a spare
> CPU module to put in?

Unfortunately, no. Not unless mismatched CPU modules will work with
NetBSD (I've never been clear about that, though the idea seems dubious
;-)).

I've seen it once more since then. The box will run for weeks or months
between occurrences, though at the time it would last hours at most. I
also have two boxes running into this issue. Oddly, both were running
with the case top off at the time ... perhaps a heat-related issue?

-T


-- 
Lonny:   "What's that command to add something to SysV init?"
Tillman: "c h k tab tab"
Lonny:   "chkconfig --add!"
Tillman: "Cool. I just tab-completed Lonny's brain."