Subject: Re: Is anyone seeing lots of TCP checksum errors?
To: John F. Woods <jfw@jfwhome.funhouse.com>
From: Jason Thorpe <thorpej@nas.nasa.gov>
List: current-users
Date: 11/07/1997 15:05:19
On Fri, 7 Nov 1997 13:57:52 -0500 (EST) 
 "John F. Woods" <jfw@jfwhome.funhouse.com> wrote:

 > One very obvious suspect is, of course, the TCP checksum code on NetBSD, so
 > I am wondering if anyone else is seeing TCP checksum errors, especially in
 > situations where on-the-wire corruption can be reasonably ruled out.
 > (Conversely, can anyone think of perfectly natural reasons why I would see these
 > errors in the normal course of affairs?)

        567752 packets received
                192055 acks (for 178138975 bytes)
                16319 duplicate acks
                0 acks for unsent data
                360819 packets (132490877 bytes) received in-sequence
                5471 completely duplicate packets (141406 bytes)
                6 old duplicate packets
                153 packets with some dup. data (15201 bytes duped)
                8800 out-of-order packets (1165156 bytes)
                11 packets (2 bytes) of data after window
                2 window probes
                4105 window update packets
                10 packets received after close
                821 discarded for bad checksums
                0 discarded for bad header offset fields
                0 discarded because packet too short

That's not really a very high "bad checksum" rate: 821 of 567752
packets received, or about 0.14%.  This is an SS2 with a fairly busy
Ethernet, and I could imagine that a few garbage packets arrive from
time to time.

Note that TCP uses in_cksum(), which is also used for UDP and for IP
headers.  If you are seeing bad checksums due to a bad in_cksum(),
I'd think it would affect IP headers and UDP as well.  What metrics do
you have, here?

Since you're seeing this through PPP, I could very well believe that
you're simply getting a bit of garbled data from time to time.

If you _really really_ suspect in_cksum(), try replacing it with
one of the old, slow, known-to-be-100%-correct C implementations and
see if you have the same problems.
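
For reference, a minimal sketch of what such a slow, straightforward
checksum might look like, in the style of RFC 1071.  This is not the
NetBSD in_cksum(): the name slow_cksum() and the flat-buffer interface
are assumptions made for illustration, whereas the kernel routine walks
an mbuf chain.  It is only meant as something to compare results against.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical reference implementation of the Internet checksum
 * (RFC 1071 style).  Sums the buffer as big-endian 16-bit words,
 * folds the carries back into the low 16 bits, and returns the
 * one's complement of the result.
 */
uint16_t
slow_cksum(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	/* Add each complete 16-bit word, most significant byte first. */
	for (i = 0; i + 1 < len; i += 2) {
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
		/* Fold the carry each step so the sum cannot overflow. */
		sum = (sum & 0xffff) + (sum >> 16);
	}

	/* An odd trailing byte is treated as if padded with a zero byte. */
	if (i < len)
		sum += (uint32_t)buf[i] << 8;

	/* Fold any remaining carries (twice covers the worst case). */
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)(~sum & 0xffff);
}
```

To compute a checksum, zero the checksum field and store the return
value there in network byte order.  To verify, run the routine over the
data with the checksum field left in place; a result of 0 means the
checksum is good.  For TCP and UDP the pseudo-header has to be included
in the summed data as well.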

Jason R. Thorpe                                       thorpej@nas.nasa.gov
NASA Ames Research Center                            Home: +1 408 866 1912
NAS: M/S 258-6                                       Work: +1 650 604 0935
Moffett Field, CA 94035                             Pager: +1 415 428 6939