Subject: Re: LLSC MEM test?
To: <>
From: Ross Harvey <ross@ghs.com>
List: port-alpha
Date: 06/07/1999 10:32:14
Paul Mather <paul@gromit.dlib.vt.edu> writes:
> >>> test mem
> T-STS-MEM - LLSC Test: Addr 00800000 FWD Wr 00000000
> ? T-ERR-MEM - stl_c bcache miss with victim at Addr: 00d91bb8
> T-STS-MEM - Uncorrected Error count = 1
> ? T-ERR-MEM - FAILED status = 20 test Init addr = 00d91bb8
> ?? 810       MEM 0x0020
> [ ... ]

I don't have much insight into what the FW is doing or even into what that
failure message really means, but the apparent contradictions that you found
hard to resolve aren't too hard to speculate on... :-)

	1. You don't actually *know* the SIMMs aren't bad, you just have
	   a data point that says they tested good. Perhaps the 3000s are
	   at slightly different rev or ECO levels, or some critical part
	   just happens to be faster on one. (There is a huge gap between
	   min and max prop delays, and the 3000 is built with very low
	   levels of integration, which means high relative uncertainty
	   between different signals.)

	2. As Chris said, maybe it's the bcache and not the DRAM.

	3. Perhaps the FW turns off ECC for the purposes of the test.
	   I would have. If so, the same errors would be corrected by
	   ECC once NetBSD is running, and you would never notice them.

	4. I would bet the framebuffer RAM error is independent, though
	   not necessarily. ECC won't deal with a completely stuck data
	   line, not unless the FW is really smart and manages to turn on
	   the `correct it without logging it' mode that at least some of
	   the Alpha HW has.
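
To make points 3 and 4 a bit more concrete, here is a toy sketch. It uses
a Hamming(7,4) code, which is NOT the ECC the 3000 actually implements
(that is a wider code I haven't looked up), and every function name
below is made up for illustration. It only shows the general idea: a
stuck line looks like a single-bit error in every word whose bit
disagrees with the stuck value, so the data still comes back right but
each such read is a correction event.

```python
# Toy Hamming(7,4) single-error-correcting code (illustrative only;
# not the real Alpha ECC). Codeword bits live at positions 1..7.

def encode(nibble):
    """Encode 4 data bits into a 7-bit codeword (index 0 is unused)."""
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8
    code[3], code[5], code[6], code[7] = d        # data positions
    code[1] = code[3] ^ code[5] ^ code[7]          # parity over 1,3,5,7
    code[2] = code[3] ^ code[6] ^ code[7]          # parity over 2,3,6,7
    code[4] = code[5] ^ code[6] ^ code[7]          # parity over 4,5,6,7
    return code

def decode(code):
    """Return (nibble, corrected_position); position 0 means no error."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos                        # XOR of set positions
    if syndrome:
        code[syndrome] ^= 1                        # flip the bad bit back
    d = [code[3], code[5], code[6], code[7]]
    return sum(bit << i for i, bit in enumerate(d)), syndrome

def read_with_stuck_line(code, stuck_pos, stuck_value):
    """Model a data line that always reads back as stuck_value."""
    faulty = list(code)
    faulty[stuck_pos] = stuck_value
    return faulty

def count_corrections(stuck_pos=5, stuck_value=0):
    """Count how many of the 16 possible words need a correction."""
    corrected = 0
    for nibble in range(16):
        word = read_with_stuck_line(encode(nibble), stuck_pos, stuck_value)
        value, pos = decode(word)
        assert value == nibble                     # data is still recovered
        if pos:
            corrected += 1
    return corrected
```

In this toy model count_corrections() reports that half of all words
need fixing, which is the kind of steady stream of corrected errors that
would get logged unless the hardware is told to correct quietly. With
ECC off, as guessed in point 3, the same reads would simply return bad
data.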

Ross.Harvey@Computer.Org