Port-sparc archive


Re: "Hard memory error" -- where?



At 10:11 -0400 on 13.8.2008, Michael Lorenz wrote:
>> Is there any way to learn from the panic message _which_ module
>> displays the hard error?
>
>Not without the fault's physical address.

Which the panic message doesn't provide? Pity. Is there anything I can do
from the debugger prompt if the problem should recur?

>> Or any memory test tool that would provide module information?
>
>Yes, the firmware will do just that.
>IIRC all you need to do is to run 'selftest' on the /memory node.

"help test" recommends "test /memory".

>Also, I'm not sure if you need to
>change the memtest-megs# PROM variable

Yes, apparently you do:

<#0> ok setenv selftest-#megs 512
selftest-#megs =      512
<#0> ok test /memory

<#0> ok

>Most OBP versions will tell you outright which slot contains the
>faulty module. Some others will at least give you a physical address,
>you may need the SS20 Service Manual to translate that into a slot
>number

I have that around.
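
For the address-to-slot step, a rough sketch of the lookup, assuming
(hypothetically) that every slot owns an equally sized, contiguous window
of physical address space. The window size and slot count below are
placeholders, not SS20 facts; the real values and slot order have to come
from the service manual. The function name is made up for illustration.

```python
# Placeholder values: replace with the figures from the service manual.
SLOT_WINDOW_BYTES = 64 * 1024 * 1024   # assumed address window per slot
SLOT_COUNT = 8                         # assumed number of slots


def slot_for_address(phys_addr, window=SLOT_WINDOW_BYTES, slots=SLOT_COUNT):
    """Return the zero-based index of the window that contains phys_addr."""
    if phys_addr < 0:
        raise ValueError("physical address must be non-negative")
    index = phys_addr // window
    if index >= slots:
        raise ValueError(
            "address 0x%x lies outside the assumed memory map" % phys_addr
        )
    return index
```

The returned index is only a position in the assumed map; which physical
slot label it corresponds to is again something the manual has to say.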

Unfortunately, the error seems to be intermittent, and the OFW memory test
is not thorough enough to reproduce it. So I guess I have to wait till it
shows up again, then try to collect more information...

Thanks so far,

        hauke


--
"It's never straight up and down"     (DEVO)



