Subject: Re: CPU memory read ECC error
To: Aaron J. Grier <agrier@poofy.goof.com>
From: Simon Burge <simonb@netbsd.org>
List: port-pmax
Date: 08/15/1999 11:02:05
"Aaron J. Grier" wrote:

> I recently filled my 5000/240 with gobs of 8MB modules...  that was four
> days ago.  I tested them with the console tester before I fired
> everything up, and they all tested good.  However, now I'm getting:
> 
> goldberry:/var/log# grep ECC messages | cut -c36-80 | grep CPU | sort | uniq
> CPU memory read ECC error at 0x04403c98
> CPU memory read ECC error at 0x0442dc98
> CPU memory read ECC error at 0x04473c98
> [etc...]
> 
> FWIW, if I subtract the highest address from the lowest one, assuming
> these are byte-indexed addresses, then they all fall within an 8MB
> range.  My hope is that I have a single module that I can fix by
> wiggling it a little bit.  (I know I've had to do that before...)
> 
> How do I map these to physical addresses so I can figure out which
> memory module(s) to wiggle?

RAM starts at 0, and 0x04473c98 ~= 8.5 * 0x800000 (8MB), so all of
those addresses land in the 9th module.  Wobble or throw out the 9th
module from the back (I think that /240's have the slots numbered).
I've also had success making slot connections better with a pencil
eraser (making sure there are no rubber bits left over afterwards!).
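
The arithmetic above can be sketched as a few lines of Python.  This is
only an illustration, not the kernel patch: it assumes RAM starts at 0,
that every module is 8MB, and that modules are counted from 1 in address
order.  The name module_for_address is made up for this example, and how
the count maps to a physical slot on the board is not something the
script knows.

```python
# Assumption: every module is 8MB (0x800000 bytes) and RAM starts at 0.
MODULE_SIZE = 0x800000


def module_for_address(addr, module_size=MODULE_SIZE):
    """Return the 1-based module count for a physical address.

    Hypothetical helper: assumes equal-sized modules laid out
    contiguously from address 0.
    """
    return addr // module_size + 1


# Addresses taken from the ECC error messages quoted above.
addresses = [0x04403c98, 0x0442dc98, 0x04473c98]

for addr in addresses:
    print(f"0x{addr:08x} -> module {module_for_address(addr)}")
```

All three addresses come out as module 9 under those assumptions, which
matches the estimate above.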

If you can live with it a bit longer, I'd like to make a patch that
would show the module number.  Can you test this?  If so, what version
of NetBSD are you running?  I can build 1.4.1 or -current kernels, or
give you a source patch to try.

Simon.