Subject: Re: VIA VP2 chipset
To: None <port-i386@NetBSD.ORG>
From: Wolfgang Rupprecht <wolfgang@wsrcc.com>
List: port-i386
Date: 02/05/1998 10:07:32
ws@tools.de (Wolfgang Solfrank) writes:
> Note that the PowerPC 604 processor does have parity protection for its
> on chip cache.  It (and the 603, too) even has parity protection not only
> on its data bus, but also on its address bus, and it even honours it when
> snooping the bus.

Interesting.

It strikes me, though, that a truly fault-tolerant (or at least
fault-detecting) system would also need to carry ECC/parity bits
end-to-end to all the chips attached to the data bus.  Certainly all
the I/O chips would need to generate parity and send it.  For example,
what good would it do to have memory ECC-protected all the way to the
internal CPU data bus, only to have noise corrupt data on the bus
between the Ethernet DMA and memory?  (And who knows if this isn't
happening in real life.  It may well be the cause of some of the bad
IP checksums that one occasionally sees on direct Ethernet
connections.  These bad IP checksums clearly aren't happening on the
wire, since the Ethernet CRC would flag them, and there aren't enough
bad CRCs to account for such a large number of bad packets slipping
by.)
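
To make the CRC-vs-checksum argument concrete, here is a minimal sketch
in C of the standard Internet checksum (RFC 1071) applied to an IPv4
header.  The function names inet_checksum and ip_header_ok are my own,
purely for illustration, and are not taken from any kernel source.  The
idea: the Ethernet CRC is checked as the frame comes off the wire, so a
bit flipped later, on the way from the controller into memory, can only
be noticed by a check the host runs over the bytes as they sit in
memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * RFC 1071 Internet checksum: the one's complement of the one's
 * complement sum of the data, taken as big-endian 16-bit words.
 * An odd trailing byte is padded with a zero low byte.
 */
uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;

    assert(data != NULL || len == 0);

    while (len > 1) {
        sum += ((uint32_t)data[0] << 8) | data[1];
        data += 2;
        len -= 2;
    }
    if (len == 1)
        sum += (uint32_t)data[0] << 8;

    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)~sum;
}

/*
 * Hypothetical helper: a header whose checksum field is intact sums
 * (checksum field included) to 0xffff, so the complement is 0.
 */
int ip_header_ok(const uint8_t *hdr, size_t len)
{
    return inet_checksum(hdr, len) == 0;
}
```

Running ip_header_ok over a 20-byte header that arrived intact returns
1; flip a single bit in the buffer afterwards (standing in for noise
between the DMA engine and memory) and it returns 0, even though the
Ethernet CRC was fine when the frame was received.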

Just to add some x86/NetBSD relevance here, I wonder if a
parity/uncorrectable ECC error in NetBSD shouldn't be handled a bit
more like a SEGV if it occurs in user pages.  If we have a 64 MB
machine and the kernel only uses 4 MB or so, a memory error is far
more likely to corrupt a user page than a kernel page.  Panicking the
kernel is probably an over-reaction.  I recall SunOS (4.1 ???) had a
layered approach that was kind of clever.  What I recall of it:

	User write-protected text pages - refetch from disk
	User data pages - kill program
	Kernel pages - panic
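
The layering above could be sketched in C roughly like this.  To be
clear, this is not SunOS or NetBSD code; the enum and function names
(page_kind, mem_action, memerr_policy) are hypothetical, and a real
handler would first have to work out which kind of page the failing
physical address belongs to.

```c
/* What kind of page the uncorrectable error landed in (hypothetical). */
enum page_kind {
    PAGE_USER_TEXT,   /* user write-protected text, backed by a file */
    PAGE_USER_DATA,   /* user data, possibly modified */
    PAGE_KERNEL       /* anything the kernel itself depends on */
};

/* What to do about it, from least to most drastic (hypothetical). */
enum mem_action {
    ACT_REFETCH_FROM_DISK,
    ACT_KILL_PROCESS,
    ACT_PANIC
};

/* Map the page kind to the mildest response that is still safe. */
enum mem_action memerr_policy(enum page_kind kind)
{
    switch (kind) {
    case PAGE_USER_TEXT:
        /* A clean copy exists on disk, so the page can be reloaded. */
        return ACT_REFETCH_FROM_DISK;
    case PAGE_USER_DATA:
        /* The contents are lost, but only this one program is affected. */
        return ACT_KILL_PROCESS;
    case PAGE_KERNEL:
        return ACT_PANIC;
    }
    /* Unknown page kind: assume the worst. */
    return ACT_PANIC;
}
```

Only the last case takes the whole machine down; the other two limit
the damage to a page reload or a single process.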
				
-wolfgang