Subject: Re: Verifying a kernel.
To: Rhialto <rhialto@falu.nl>
From: Jonathan Stone <jonathan@dsg.stanford.edu>
List: tech-kern
Date: 07/20/2005 17:36:47
In message <20050721000924.GH26315@falu.nl>,   Rhialto writes:

>On Wed 20 Jul 2005 at 14:55:34 -0700, Jonathan Stone wrote:
>> Of course, a CRC-32 isn't going to catch errors where xor'ing the
>> original message and the damaged message gives you a polynomial
>> that's  divisble by the generator polynomial. But how likely is that for
>> "Naturally occurring failures", as opposed to enemy action?
>
>Note that if, for instance, a single 32-bit word is set to all zeros,
>that is NOT (in general) a single 32-bit error. Since some of the
>original bits (in general) were zero to begin with, you have multiple
>shorter errors.  Hence guarantees like "catches all single error bursts
>of up to 32 bits" do not guarantee anything here since they do not
>apply.

No, that's flatly incorrect.  The well-known theoretical guarantees
are based on treating the message *and* the "error" as polynomials,
then using Galois (finite) field theory.

The standard terminology, at least as applied to communications, is to
define the length of a "burst error" as the number of bits between the
first and last non-zero bits, inclusive, in the xor of the original,
undamaged and the damaged message.  References include:

@Book{peterson-error-codes-book,
  author =       {W. W. Peterson and E.J Weldon},
  title =        {{Error Correcting Codes}},
  publisher =    {MIT Press, Cambridge, Massachusetts.},
  year =         {1972},
  edition =      {2nd.}
}

@Book{macwilliams:coding:theory,
  author = 	 {F.J. MacWilliams, N.J.A. Sloane},
  title = 	 {The Theory of Error-Correcting Codes},
  publisher = 	 {North-Holland},
  year = 	 {1977}, 
  address = 	 {Amsterdam}
}

@Book{Blahut:error:codes,
  author = 	 {R.E. Blahut},
  ALTeditor = 	 {},
  title = 	 {Theory and Practice of Error Control Codes},
  publisher = 	 {Addison-Wesley},
  year = 	 {1994}
}

If you work through the exercises in the releveant chapters of Blahut
(enough to see how to use a CRC syndrome to do error correction),
you'll be in a much better position to comment.  IIRC, the first of
Peterson's textbook has some cute shortcuts to the same result.


The origins of Cyclic Redudancy Checks is widely attributed to Prange:

@TechReport{prange-1957-cyclic-codes,
  author = 	 {E. Prange},
  title = 	 {{Cyclic Error-Correcting codes in two symbols}},
  institution =  {Air Force Cambridge Research Center, Cambridge, Mass.},
  year = 	 {1957},
  number = 	 {AFCRC-TN-57-103},
}


(It's been 4 or 5 years, but I think the standard claims about
error-detection, and the classing shift-register implementation, date
back to Prange's original papers: the one above, and a AFCRC-TN-58-156
from 1958.)