Subject: Re: BUG IN IF_ED DRIVER PERSISTS UNTIL TODAY.
To: Brian Buhrow <buhrow@cats.ucsc.edu>
From: Charles M. Hannum <mycroft@mit.edu>
List: tech-kern
Date: 08/30/1996 15:36:21
buhrow@cats.ucsc.edu (Brian Buhrow) writes:

> 
> 	The problem is  in the handling of a hardware error condition.  If the
> card resets during a particularly busy set of traffic flows, the ring
> buffer pointers can get balloxed up, causing data corruption in the
> outgoing packet.  While TCP will detect that the packet didn't make it to
> its destination, it will wrongly resend the generated packet, which is now
> garbage, thanks to the fine chip makers at National Semiconductor, which
> won't get through because the packet doesn't pass the IP checksum, which,
> of course, it shouldn't.
> 	The problem in the driver is that if it resets the card, due to the chip's 
> failure, it doesn't return a different status to the sending output
> routine.  Here's the relevant section of the driver.
>
> [...]

The first piece of code you quote has to do with packet reception.
This code `can't fail', unless we are out of mbufs, in which case the
packet is silently dropped.  I don't see a problem here.

The second piece of code has to do with packet transmission.  This
code doesn't modify the mbufs in kernel memory, so there's no way it
could corrupt the retransmissions.  So far, I still don't see a
problem.
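
To make the point concrete, here is a minimal sketch, not the actual
if_ed code.  The names `fake_mbuf' and `copy_chain_to_card' are
hypothetical, and the card's buffer memory is simulated with an
ordinary array.  The source chain is only ever read (it is declared
const), so whatever happens to the device copy, a retransmission
starts from the same bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a kernel mbuf chain. */
struct fake_mbuf {
    const unsigned char *data;      /* payload bytes, read-only here */
    size_t len;                     /* number of bytes in this link */
    const struct fake_mbuf *next;   /* next link, or NULL at the end */
};

/*
 * Copy every link of the chain into (simulated) card memory.
 * Returns true and stores the total length in *copied on success.
 * Returns false if the chain does not fit; the card buffer may then
 * hold a partial copy, but the chain itself is never written to.
 */
bool copy_chain_to_card(const struct fake_mbuf *m, unsigned char *card,
                        size_t card_size, size_t *copied)
{
    size_t total = 0;

    assert(card != NULL && copied != NULL);

    for (; m != NULL; m = m->next) {
        if (m->len > card_size - total)
            return false;
        memcpy(card + total, m->data, m->len);
        total += m->len;
    }
    *copied = total;
    return true;
}
```

Scribbling over the `card' array after the copy leaves the chain's
bytes untouched, which is the property that matters for TCP
retransmission.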

The one place I do see a (minor) problem is that, even if the transmit
DMA fails to complete, we send whatever happens to be in the device's
memory as a packet (*once*).  This will almost certainly result in the
packet being dropped wherever it arrives.  So far, I *still* don't
see a problem.  In addition, unless you're actually seeing `remote
transmit DMA failed to complete' messages, this isn't relevant at all.
(This should be fixed anyway, of course.)
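
One possible shape for that fix, as a sketch only: check whether the
remote DMA completed and, if it did not, count an output error and
skip the transmit instead of sending stale device memory.  Everything
here is hypothetical (`fake_softc', `wait_for_remote_dma',
`load_and_start', and the poll limit of 100000); the hardware poll is
simulated by a plain comparison rather than by reading a chip
register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical per-interface state; not the real driver's softc. */
struct fake_softc {
    unsigned long oerrors;     /* packets dropped on output */
    unsigned long tx_started;  /* transmissions actually started */
};

/*
 * Simulated completion poll: `polls_needed' stands in for how long
 * the hardware takes.  Returns true if the DMA finishes in time.
 */
static bool wait_for_remote_dma(long polls_needed, long max_polls)
{
    return polls_needed <= max_polls;
}

/*
 * Start the transmitter only if the copy to the card completed.
 * Returns true if a transmission was started, false if the packet
 * was dropped and counted as an output error.
 */
bool load_and_start(struct fake_softc *sc, long polls_needed)
{
    const long max_polls = 100000;  /* assumed limit, illustrative only */

    assert(sc != NULL);

    if (!wait_for_remote_dma(polls_needed, max_polls)) {
        fprintf(stderr, "remote transmit DMA failed to complete\n");
        sc->oerrors++;
        return false;
    }
    sc->tx_started++;
    return true;
}
```

With this structure a failed copy costs one dropped packet, which TCP
retransmits from the unmodified mbufs, rather than one garbage packet
on the wire.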

I believe your analysis is incorrect.