Subject: Re: Processor correctable error?
To: None <cgd@pa.dec.com, mjacob@feral.com>
From: Matthew Jacob <mjacob@feral.com>
List: port-alpha
Date: 06/10/1998 15:45:12
>> Chris - I'll have to ponder this. Ultimately, most of this stuff
>> will get covered under DIAGNOSTIC anyway (for reporting to the console
>> about errors),
>
>Actually, I disagree.  If we're listening for correctable error
>reports, it is probably correct to print them (or just log them),
>regardless of DIAGNOSTIC or DEBUG.
>
>Users should be informed about hardware-related errors that the kernel
>knows about.
Fair enough.
>
>
>> but I really can't quite believe that you've just
>> made an argument that goes 'Disable Correctable Error Reporting
>> and All Will Be Well'- which is how I have (mis?)understood your
>> mail to read.
>
>No, that's exactly what I meant.
>
>The Green Book says unequivocally that, if correctable error reporting
>is disabled, the correctable errors will be corrected automatically
>(presumably by the PALcode).
>
>I interpreted the surrounding text to mean that if reporting is not
>disabled, they'll still be corrected, and that additionally the error
>will be reported.  However, that interpretation may be incorrect.
>
>If my interpretation was incorrect (and some ideas on the matter from
>those more familiar with PALcode would help; Ross?), then you're faced
>with a tradeoff:
>
>	* disable correctable error reporting, knowing that (according
>	  to the architecture reference) the errors will be corrected
>	  properly for you.
>
>	* keep correctable error reporting enabled, and have to write
>	  platform- and cpu-specific code to correct the errors.
The latter, in fact, is exactly what I've been doing for the 8200 && 4100.
>From a maintenance perspective (and an "availability of documentation"
>perspective), the former is very attractive.
Absolutely. I think that the approach you took with this originally
(keep it to the Green Book) has worked very well. But I'm not sure
it'll work in all cases- I don't believe that the PAL code does in
fact handle recoverable TLSB, DWLPX or MCPCIA errors (for example).
But I'm certainly willing to be dissuaded from writing this extra
support if we don't need it.
-matt