Subject: Re: Multia machine check on reboot or halt
To: Tim Rightnour <root@garbled.net>
From: Jason Thorpe <thorpej@nas.nasa.gov>
List: port-alpha
Date: 06/08/1998 10:17:56
On Mon, 08 Jun 1998 05:26:23 -0700 (MST) 
 Tim Rightnour <root@garbled.net> wrote:

 > halted CPU 0
 > 
 > halt code = 5
 > HALT instruction executed
 > PC = fffffc0000300140
 > 
 > CPU 0 booting
 > 
 > 
 > Unexpected Machine Check through vector 00000067

Wow, a machine check while in the SRM ... "that doesn't make me feel so
good..."

[ SRM spew deleted ]

...more to follow...

 > >>>boot
 > (boot dka400.4.0.6.0 -flags a)
 > block 0 of dka400.4.0.6.0 is a valid boot block
 > reading 15 blocks from dka400.4.0.6.0
 > bootstrap code read in
 > base = 166000, image_start = 0, image_bytes = 1e00
 > initializing HWRPB at 2000
 > initializing page table at 158000
 > initializing machine state
 > setting affinity to the primary CPU
 > jumping to bootstrap code
 > 
 > NetBSD/Alpha Primary Boot
 > ._._._._._._._._._Jumping to entry point...
 > 
 > NetBSD/Alpha Secondary Boot, Revision 1.9
 > (cjs@bishop, Wed Dec 31 00:52:27 PST 1997)
 > 
 > VMS PAL revision: 0x1000000010530
 > OSF PAL rev: 0x1000000020123
 > Switch to OSF PAL code succeeded.
 > 
 > Boot flags: a
 > 
 > Loading netbsd...
 > 
 > Entering netbsd at 0xfffffc0000300fc0...
 > [ preserving 395912 bytes of netbsd symbol table ]
 > Copyright (c) 1996, 1997, 1998
 >     The NetBSD Foundation, Inc.  All rights reserved.
 > Copyright (c) 1982, 1986, 1989, 1991, 1993
 >     The Regents of the University of California.  All rights reserved.
 > 
 > NetBSD 1.3E (RIGEL) #0: Sun Jun  7 19:21:40 MST 1998
 >     root@rigel:/usr/src/current/src/sys/arch/alpha/compile/RIGEL
 > (PCI ISA), 167MHz
   ^^^^^^^^^
Wow, that's a new one... I haven't seen that before :-)

 > 8192 byte page size, 1 processor.
 > real mem = 50331648 (2424832 reserved for PROM, 47906816 used by NetBSD)
 > avail mem = 39149568
 > using 584 buffers containing 4784128 bytes of memory
 > mainbus0 (root)
 > cpu0 at mainbus0: ID 0 (primary), 21066 (pass 2)
 > cpu0: VAX FP support, IEEE FP support, Primary Eligible
 > lca0 at mainbus0
 > pci0 at lca0 bus 0
 > pci0: i/o enabled, memory enabled
 > unknown vendor 0x200d product 0x00c1 (prehistoric subclass 0xc1, interface
 > 0x20, revision 0x0d) at pci0 dev 0 function 0 not configured
 > [deleted about 50 of these]

This seems ... odd.  It sounds very much like you have buggy firmware.  Can
I ask which firmware version you're running?

Jason R. Thorpe                                       thorpej@nas.nasa.gov
NASA Ames Research Center                            Home: +1 408 866 1912
NAS: M/S 258-5                                       Work: +1 650 604 0935
Moffett Field, CA 94035                             Pager: +1 650 428 6939