Subject: Re: Examining core dump..
To: None <cgd@pa.dec.com, mjacob@ns.feral.com>
From: Matthew Jacob <mjacob@ns.feral.com>
List: port-alpha
Date: 11/10/1997 09:56:03
>From cgd@pa.dec.com  Mon Nov 10 09:53:16 1997
>> Machine checks are for exception conditions in alpha. Interrupts
>> as well as memory errors.
>> 
>> In -current (and for a while) all of the known machine check
>> conditions don't lead directly to a panic. There should have
>> been a printf like:
>> 
>>   panic("unexpected interrupt: type 0x%lx vec 0x%lx a2 0x%lx\n", a0, a1, a2);
>> 
>> What were the contents? What release && h/w are you running, anyway
>> (I forget)?
>
>
>This is a really scary statement, considering that you've apparently
>changed the interrupt delivery code substantially.

Nope, I haven't, at least I don't think so.

>
>
>On the Alpha, exceptional conditions like device interrupts,
>interprocessor interrupt requests, performance monitor interrupts, and
>machine checks are expressed as "interrupts."  (There are other types
>of exceptional conditions, e.g. memory management faults, instruction
>faults, unaligned access faults, system calls, floating point
>exceptions, etc., which are expressed differently.)
>
>Machine checks (and their close cousins, correctable errors) indicate
>a hardware problem or serious software bug.  Correctable errors
>typically signal things like memory bit-flips (correctable by ECC).
>Machine checks either indicate uncorrectable memory problems, "other
>hardware problems," or software bugs (like the OS touching device or
>memory space where there was no device or memory).
>
>
>The default handler for machine checks and correctable errors (and,
>looking through the code, nothing seems to install a custom handler)
>causes the following behaviour:
>
>	(1) On correctable errors, a warning message is printed.
>
>	(2) On expected machine checks (during device probes),
>	    a flag is set.
>
>	(3) On unexpected machine checks, the system panics.
>
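
For anyone following along, here is a rough C sketch of the three-way
behaviour Chris lists above. Every name in it (mc_default_handler,
struct mc_state, the enums) is hypothetical and made up for
illustration; it is not the actual NetBSD/alpha code, and a real kernel
would call panic() where this sketch only returns MC_PANIC.

```c
#include <stdio.h>

/* Hypothetical names throughout; not taken from the NetBSD/alpha tree. */
enum mc_kind   { MC_CORRECTABLE, MC_MACHINE_CHECK };
enum mc_action { MC_WARNED, MC_FLAGGED, MC_PANIC };

struct mc_state {
	int expected;	/* set by probe code before touching a maybe-absent device */
	int received;	/* set by the handler when an expected check arrives */
};

/* Decide what to do with one event, following cases (1) to (3). */
enum mc_action
mc_default_handler(struct mc_state *st, enum mc_kind kind)
{
	if (kind == MC_CORRECTABLE) {
		/* (1) correctable error: warn and carry on */
		printf("warning: correctable error\n");
		return MC_WARNED;
	}
	if (st->expected) {
		/* (2) expected machine check (device probe): just set a flag */
		st->received = 1;
		return MC_FLAGGED;
	}
	/* (3) unexpected machine check: a real kernel would panic() here */
	return MC_PANIC;
}
```

Probe code in this sketch would set st.expected before the access and
look at st.received afterwards to learn whether the device was there.
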
>
>You seem to indicate that interrupts and memory errors are both types
>of machine checks.  (That's what your first sentence says...)  This is
>absolutely incorrect.

I stand corrected.

>
>Machine checks and correctable errors are both types of interrupts.
>(Actually, they're both the _same_ type, and you distinguish between
>them by bits in the Machine Check Error Summary register.)  Memory
>errors can cause either machine checks or correctable errors,
>depending on whether or not they were, in fact, correctable.
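
A minimal sketch of telling the two apart from the summary register
bits, as described above. The bit values and names here are my
assumptions for illustration (one "machine check" bit plus separate
system and processor "correctable error" bits); check the Alpha
architecture documentation before relying on them.

```c
/* Assumed bit layout, for illustration only. */
#define MCES_MCK	0x1UL	/* machine check (assumed position) */
#define MCES_SCE	0x2UL	/* system correctable error (assumed) */
#define MCES_PCE	0x4UL	/* processor correctable error (assumed) */

enum mces_class { MCES_NONE, MCES_CORRECTABLE, MCES_MACHINE_CHECK };

/*
 * Classify a value read from the summary register.  A machine check
 * takes priority when both kinds of bit are set.
 */
enum mces_class
mces_classify(unsigned long mces)
{
	if (mces & MCES_MCK)
		return MCES_MACHINE_CHECK;
	if (mces & (MCES_SCE | MCES_PCE))
		return MCES_CORRECTABLE;
	return MCES_NONE;
}
```
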
>
>
>On a slightly different note, you might note that the current version
>of the alpha "interrupt.c" has copyright notices on it that
>technically prohibit it from being used or distributed.  "You might
>fix that."  (It's your copyright notice, Matt.)

It is? Oh, sorry. Thanks, Chris. I've been sloppy. You know me, I'm
really too easygoing about all of this.

Thanks for chiming in with the correct info, Chris.

-matt