Subject: Re: bridge(4) and silent data corruption :-(
To: Sean Doran <firstname.lastname@example.org>
From: Dennis Ferguson <email@example.com>
Date: 04/30/2002 10:47:01
> I wrote:
> | I don't see how/why the router should behave differently depending
> | on whether there is a NetBSD box bridging or not...
> | There is no NAT. (Even if there were, why would the presence of
> | a bridging NetBSD box make a difference?)
> For the sake of clarity, the question in parentheses is NOT rhetorical,
> or meant to be argumentative. It's the crux of the problem. :-(
Do you see CRC errors on the station-a box when the station-a box is
getting traffic through the router? I'm wondering if running the
interfaces promiscuous has somehow turned off the media-level error
checking so the bridge is forwarding (and correcting the CRCs on)
packets that would normally be dropped. Note that this would also
need to be happening on the Mac when it is running tcpdump since the
44003 frame that causes the problem in your dump showed up there too.
> What on earth is munging segments but leaving the checksums intact
> (or fixing them), but only for traffic that leaves the bridged LAN,
> or arrives on it from elsewhere?
The classic ways this can happen are either getting so many errors
that one of the packets hits the 1/65535 probability of getting the
checksum randomly right, or something swapping chunks of the packet
so that you get the same data in a different order (the TCP checksum
is a sum of 16-bit words, so it doesn't care what order they arrive
in). In the latter case, if the router were sending packets errored
like that with a bad CRC which the bridge ignored, you would have a
problem like the one you are seeing (I know this is stretching a bit).
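To make the chunk-swapping case concrete, here is a rough Python
sketch. It assumes an RFC 1071 style 16-bit ones' complement sum and
uses zlib's CRC-32 as a stand-in for the Ethernet FCS; the helper
names (internet_checksum, swap_chunks) and the 64-byte payload are
made up for illustration and have nothing to do with the actual
bridge(4) code or with your dump.

```python
import zlib


def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # Fold the carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def swap_chunks(data: bytes, a: int, b: int, size: int) -> bytes:
    """Exchange two non-overlapping size-byte chunks at offsets a and b.

    Hypothetical helper: offsets and size are kept even so that whole
    16-bit words move around intact.
    """
    buf = bytearray(data)
    buf[a:a + size], buf[b:b + size] = data[b:b + size], data[a:a + size]
    return bytes(buf)


payload = bytes(range(64))                # made-up segment contents
munged = swap_chunks(payload, 8, 40, 16)  # same words, different order

# The ones' complement sum is order-independent, the CRC is not.
same_sum = internet_checksum(payload) == internet_checksum(munged)
same_crc = zlib.crc32(payload) == zlib.crc32(munged)
print("TCP-style checksum unchanged:", same_sum)
print("CRC-32 unchanged:", same_crc)
```

The point is only that a 16-bit-aligned swap is invisible to the
TCP checksum while a link-level CRC would catch it, which is why
something ignoring bad CRCs matters here.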
Is the circuit on the far side of the router DSL? ATM presents so
many opportunities to screw things up just like this that I no longer
trust it for anything at all. Note that the last two packets from
world.22 in your dump indicate that something that reorders stuff
exists somewhere in the path, and I hate the idea of both that and
ATM appearing in adjacent sentences.