Subject: Re: Transmit interrupts, fxp driver for Intel 82557/8/9 Ethernet
To: Hal Murray <murray@pa.dec.com>
From: Erik Rungi <blackbox@openface.ca>
List: current-users
Date: 05/09/2000 09:56:57
We have a NetBSD 1.4.1/i386 box acting as a general-purpose router on our
network, and we were running into "fxp0 timeout" problems every few days, at
seemingly random intervals but especially during peak traffic periods. It was
enough to make the box unreliable as a router, so we switched to ex cards
instead. Peak traffic was probably around 4-5 Mb/s bidirectional, with about
1000 pps forwarded in each direction (just a ballpark estimate), a standard
mix of real-world net traffic.
erik
On Tue, 9 May 2000, Hal Murray wrote:
>
> The fxp driver doesn't use transmit interrupts. Under normal traffic
> patterns this works fine. The work gets done on the next receive
> interrupt, saving the CPU cycles that would have been burned by the
> overhead on each transmit interrupt.
>
> But it doesn't work very well if you run crazy test programs that
> only send UDP packets in one direction. The watchdog timer eventually
> catches things, but that takes a long long time and it puts a lot
> of "fxp0 timeout" messages on the screen.
>
> Netperf includes a test like that. It's a simple way to check for
> the classic live-lock troubles where the receiver spends all its
> CPU cycles processing a packet only to discover that the input socket
> queue is full so it drops the packet.
>
> Has anybody else encountered this quirk? Is this enough of a bug
> that I should send a PR?
>
> I found the place in the code where it sets up the transmit command.
> (That's not hard, the symbol is only used once.) When I or-ed in
> the interrupt bit, things worked as expected. (I forgot to measure
> how much more CPU that used.)
>
> I tried turning on the interrupt bit when the transmit queue was
> 3/4 full but either I botched the code or that wasn't a good enough
> heuristic.
>
> I think I've also discovered a variation of this glitch. If the
> TCP window is big enough, throughput drops way off. The break happens
> between 190K and 195K. That's slightly over 128 packets which is
> the size of the fxp transmit queue.
>
> On a pair of 400 MHz Celerons, throughput drops from 90 megabits
> with the window under 190K bytes to under 20 when the window is over
> 195K bytes.
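
For anyone who wants to play with the numbers in the quoted message, here is a
rough sketch in C. It is not taken from the fxp driver. The constant names and
bit values (TX_CMD_XMIT, TX_CMD_INTR), the helper functions, the 1460-byte
segment size (a 1500-byte Ethernet MTU minus 40 bytes of IPv4 and TCP headers,
no options) and the reading of "K" as 1024 bytes are all assumptions made for
illustration. Only the 128-entry queue size and the 3/4-full idea come from
the message above.

```c
#include <stdint.h>

/* Transmit queue size quoted in the message. */
#define TX_RING_SIZE 128u

/* Hypothetical command bits; the real driver has its own names and values. */
#define TX_CMD_XMIT 0x0004u
#define TX_CMD_INTR 0x2000u

/* Threshold for the "3/4 full" heuristic described in the message. */
unsigned tx_intr_threshold(void)
{
    return TX_RING_SIZE * 3u / 4u;
}

/*
 * Build the command word for the next transmit descriptor.
 * The interrupt bit is requested only once `pending` descriptors are
 * already queued, so a lightly loaded queue still relies on receive
 * interrupts to reclaim completed transmits.
 */
uint16_t tx_command(unsigned pending, unsigned threshold)
{
    uint16_t cmd = TX_CMD_XMIT;

    if (pending >= threshold)
        cmd |= TX_CMD_INTR;
    return cmd;
}

/*
 * Number of full-size segments needed to cover a TCP window
 * (ceiling division), assuming every segment carries `mss` bytes.
 */
unsigned segments_in_window(unsigned window_bytes, unsigned mss)
{
    return (window_bytes + mss - 1u) / mss;
}
```

With these assumptions the threshold is 96 descriptors, and windows of 190K
and 195K bytes need about 134 and 137 full-size segments respectively. Both
are a few packets more than the 128-entry queue, which matches the "slightly
over 128 packets" observation, though it says nothing about why throughput
collapses at that point.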