Subject: Re: Processor correctavke error?
To: Chris G. Demetriou <cgd@pa.dec.com>
From: Michael T. Stolarchuk <mts@rare.net>
List: port-alpha
Date: 06/11/1998 09:19:37

In message <5753.897504186@dnaunix.pa.dec.com>, "Chris G. Demetriou" writes:

>I think the PC and RA can be interesting, but may not necessarily be
>in the case of the problems you're looking at.  (In general, I've
>found over my experience with the port that machine checks or other
>faults that reference locations in locore and other lowest-level fault
>handling code are often indicators of unrelated problems, that happen
>to trigger asynchronously in those places...  That's how I look at
>them.  It's perfectly fine that you have your own take on how they
>should be looked at, however.)
>

ok, here's another machine check.  This one comes from one of the EB164s
i'm playing with; all of them exhibit the same machine check.  It occurs
when working with the second aha2940...


    ...
    ahc_pci_probe:
    ahc1 at pci0 dev 9 function 0


    unexpected machine check:

	mces    = 0x1
	vector  = 0x670
	param   = 0xfffffc0000006068
	pc      = 0xfffffc0000495500
	ra      = 0xfffffc00004954e0
	curproc = 0xfffffc0000520058
	    pid = 0, comm = 

    panic: machine check
    Stopped at      Debugger+0x4:   ret     zero,(ra)
    ..

so mces = 0x1 has only the machine-check bit set and neither of the
correctable-error bits, which says that the error is uncorrectable...
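
a minimal sketch of that reading of mces, assuming the usual MCES bit
layout (bit 0 = machine check in progress, bit 1 = system correctable
error, bit 2 = processor correctable error).  the macro and function
names here are made up for illustration and are not the kernel's:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed MCES bit layout; names are hypothetical, not from the kernel. */
#define MCES_MIP 0x01u /* machine check in progress */
#define MCES_SCE 0x02u /* system correctable error */
#define MCES_PCE 0x04u /* processor correctable error */

/* True if either correctable-error bit is set. */
static bool mces_is_correctable(uint64_t mces)
{
    return (mces & (MCES_SCE | MCES_PCE)) != 0;
}

/* True if a machine check is flagged and no correctable bit is set. */
static bool mces_is_uncorrectable_mcheck(uint64_t mces)
{
    return (mces & MCES_MIP) != 0 && !mces_is_correctable(mces);
}
```

with the value from the dump above, mces_is_uncorrectable_mcheck(0x1)
is true and mces_is_correctable(0x1) is false.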

pc:..495500 is in cia_swiz_mem_read_1...

i've got ddb to answer, but it doesn't give a trace... which register should i
use as a base for the trace? 

db> trace
db> show registers
v0                         0x7
t0                         0x1
t1          0xfffffc0000519e00  __bss_start+0x18
t2          0xfffffc001fffde4e
t3          0xfffffc001fffde6e
t4          0xfffffc001fffc000
t5          0xfffffc00004f2660  cfdata+0x48
t6          0xfffffc00004f3500  cfdata+0xee8
t7                         0x6
s0                       0x100
s1          0xfffffc00004ece40  mcclock_tlsb_busfns+0x4a0
s2          0xfffffc00004ecec0  mcclock_tlsb_busfns+0x520
s3          0xfffffc000062f748  end+0x2f2d0
s4                       0x670
s5          0xfffffc0000006068
s6                         0x4
a0                         0x7
a1           0x7ffffe42c0003f8
a2                         0x2
a3                         0xd
a4                         0x8
a5          0xfffffc000051c3a8  __bss_start+0x25c0
t8                        0x1f
t9          0xfffffc000035f29c  sprintf+0x5dc
t10                          0
t11                        0xa
ra          0xfffffc000035dcd8  panic+0xb8
t12         0xfffffc00004b7870  Debugger
at                         0x4
gp          0xfffffc00005145c0  mountroot+0x8008
sp          0xfffffc000051c3a8  __bss_start+0x25c0
pc          0xfffffc00004b7874  Debugger+0x4
ps                         0x7
ai                         0xa
pv          0xfffffc00004b7870  Debugger
Debugger+0x4:   ret     zero,(ra)
db> 

mts.