Subject: Re: CPU memory read ECC error
To: Aaron J. Grier <agrier@poofy.goof.com>
From: Simon Burge <simonb@netbsd.org>
List: port-pmax
Date: 08/15/1999 11:02:05
"Aaron J. Grier" wrote:
> I recently filled my 5000/240 with gobs of 8MB modules... that was four
> days ago. I tested them with the console tester before I fired
> everything up, and they all tested good. However, now I'm getting:
>
> goldberry:/var/log# grep ECC messages | cut -c36-80 | grep CPU | sort | uniq
> CPU memory read ECC error at 0x04403c98
> CPU memory read ECC error at 0x0442dc98
> CPU memory read ECC error at 0x04473c98
> [etc...]
>
> FWIW, if I subtract the highest address from the lowest one, assuming
> these are byte-indexed addresses, then they all fall within an 8MB
> range. My hope is that I have a single module that I can fix by
> wiggling it a little bit. (I know I've had to do that before...)
>
> How do I map these to physical addresses so I can figure out which
> memory module(s) to wiggle?
RAM starts at 0, and 0x04473c98 ~= 8.5 * 0x800000, so the errors fall
in the 9th 8MB module. Wobble or throw out the 9th module from the back
(I think that /240's have the slots numbered). I've also had success
using a pencil eraser to make the slot connections better (after making
sure there are no rubber bits left over!).
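
The arithmetic above can be sketched as a few lines of Python. This is
only an illustration: it assumes every module is 8MB (0x800000 bytes),
that the modules are mapped contiguously starting at physical address 0,
and that there is no interleaving. The function name and constants are
made up for the example, and how the resulting index maps to a physical
slot on the board still has to be checked against the hardware.

```python
# Hypothetical helper: map a physical address to a memory module index.
# Assumes contiguous 8MB modules starting at address 0 (not verified
# against the 5000/240 hardware documentation).
MODULE_SIZE = 0x800000  # 8MB per module


def module_for_address(addr, module_size=MODULE_SIZE, ram_base=0):
    """Return the zero-based index of the module containing addr."""
    return (addr - ram_base) // module_size


# The three addresses quoted from the log.
for addr in (0x04403c98, 0x0442dc98, 0x04473c98):
    idx = module_for_address(addr)
    print(f"0x{addr:08x} -> module index {idx} (module number {idx + 1})")
```

All three logged addresses land in index 8, i.e. the 9th module when
counting from 1, which matches the single-bad-module theory.
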
If you can live with it a bit longer, I'd like to make a patch that
would show the module number. Can you test this? If so, what version
of NetBSD are you running? I can build 1.4.1 or -current kernels, or
give you a source patch to try.
Simon.