Subject: Re: 3c905B-TX performance on (old) -current
To: Jonathan Stone <jonathan@DSG.Stanford.EDU>
From: Hal Murray <murray@pa.dec.com>
List: current-users
Date: 02/24/1999 02:37:45
> The linux box can send at 75 Mbit/sec. Both netbsd boxes can send and
> receive at just over 60 Mbit/sec.
> So it looks like the 905B throughput is at least comparable to a
> de21140-based NIC.
I don't know anything about the 905B but I'm reasonably familiar
with Tulips and network performance in general.
The Tulip driver has some bug/glitch in it so that packets get lost
if you run it in full-duplex mode at 100 megabits. I think the same
driver is used by most/all BSD-derived systems. I'm pretty sure
the chip itself is OK, since the DUnix driver works fine.
With Tulips, I get ~80 megabits TCP throughput on NetBSD. The same
hardware gets 90 megabits running on DUnix. That's with 600 MHz
Miatas.
The transmit side of the Tulip can only process one packet at a time.
With long packets, it will normally start sending bits out the wire
before it has DMAed the whole packet into the on-chip FIFO. If the
PCI/memory subsystem can't provide the rest of the data fast enough,
the chip generates an error and the driver raises the
when-to-start-transmitting threshold. While the Tulip is filling the
FIFO up to that threshold it isn't sending anything out on the wire.
That idle time turns out to be significant - in the range of 3 to 6%. When
I count (all) the bytes on the wire, I'd expect TCP to run at ~95
megabits. I see 93, 90, or 87, depending upon what I've been doing
previously and how far the transmit threshold has been kicked up.
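For what it's worth, here is a rough check of where a figure like ~95
megabits comes from. It is only a sketch: it assumes standard Ethernet
framing (8 bytes preamble, 14 header, 4 FCS, 12 inter-frame gap), a
1500-byte MTU, and 20-byte IP and TCP headers. The function name and
parameters are made up for illustration.

```python
def tcp_goodput_mbps(link_mbps=100.0, mtu=1500, ip_hdr=20, tcp_hdr=20, tcp_opts=0):
    """Best-case TCP payload rate when every byte on the wire is counted.

    Assumption: full-size frames back to back, no losses, no ACK traffic
    competing in the same direction.
    """
    # Per-frame Ethernet cost outside the MTU:
    # preamble/SFD (8) + MAC header (14) + FCS (4) + inter-frame gap (12)
    wire_overhead = 8 + 14 + 4 + 12
    payload = mtu - ip_hdr - tcp_hdr - tcp_opts
    wire_bytes = mtu + wire_overhead
    return link_mbps * payload / wire_bytes


if __name__ == "__main__":
    print("no TCP options:   %.2f Mbit/s" % tcp_goodput_mbps())
    print("with timestamps:  %.2f Mbit/s" % tcp_goodput_mbps(tcp_opts=12))
```

Under those assumptions the result is a bit under 95 without TCP
options and a bit over 94 with 12 bytes of timestamp options.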
Especially if the CPU is busy, it's worth experimenting with larger
socket buffers (TCP windows). The window only needs to cover a round
trip time, but that time includes the CPU processing and network
card latency as well as speed-of-light delays on a long link. (You
can also shrink the window: if throughput doesn't drop, the window
wasn't the limit.)
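The "window covers a round trip" rule is just bandwidth times delay.
A minimal sketch of the arithmetic follows; the helper names are
hypothetical, and the round-trip time you plug in has to include the
CPU and NIC latency mentioned above, not just the wire.

```python
def window_bytes_needed(bandwidth_mbps, rtt_ms):
    """Bytes that must be in flight to keep the link full for one round trip."""
    # Mbit/s * ms = kbit, so multiply by 1000 for bits, then divide by 8 for bytes.
    return bandwidth_mbps * 1000.0 * rtt_ms / 8.0


def max_throughput_mbps(window_bytes, rtt_ms):
    """Upper bound on throughput when at most one window is sent per round trip."""
    return window_bytes * 8.0 / (rtt_ms * 1000.0)


if __name__ == "__main__":
    for rtt in (1.0, 5.0, 10.0):
        print("rtt %4.1f ms: need %8.0f bytes at 100 Mbit/s"
              % (rtt, window_bytes_needed(100.0, rtt)))
    print("128K window, 10 ms rtt: at most %.1f Mbit/s"
          % max_throughput_mbps(128 * 1024, 10.0))
```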
I generally use 128K. Mostly, that's leftover from several years
ago when I was testing ATM links on 233 MHz Avantis. 64K was just
under the knee of the curve and 128K was just over.
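If you want to try 128K yourself, the buffers are set per socket with
SO_SNDBUF and SO_RCVBUF. A sketch in Python, with a helper name of my
own invention; the kernel may clamp or round the request, so it reads
the values back rather than trusting them.

```python
import socket


def set_socket_buffers(sock, size=128 * 1024):
    """Request send/receive buffers of `size` bytes; return what the kernel granted.

    Assumption: call this before connect()/listen(), since windows above
    64K depend on window scaling, which is negotiated at connection setup.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, size)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size)
    granted_snd = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    granted_rcv = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return granted_snd, granted_rcv


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    print("granted (send, recv):", set_socket_buffers(s))
    s.close()
```

The granted sizes vary by system and by whatever limits the
administrator has configured, so check them before drawing conclusions
from a benchmark.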
Similarly, longer buffers passed to send/recv help by reducing kernel
call overhead.
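
To see the kernel-call effect, count how many send() calls it takes to
move the same data with small and large buffers. This is only a sketch
over a local socketpair, not a real network test, and all the function
names are hypothetical.

```python
import socket
import threading


def send_in_chunks(sock, data, chunk_size):
    """Send `data` in pieces of at most chunk_size bytes; return the send() count."""
    view = memoryview(data)
    offset = 0
    calls = 0
    while offset < len(view):
        # send() may accept fewer bytes than offered, so advance by its result.
        offset += sock.send(view[offset:offset + chunk_size])
        calls += 1
    return calls


def drain(sock, total, bufsize):
    """Read until `total` bytes have arrived; return (bytes received, recv() count)."""
    received = 0
    calls = 0
    while received < total:
        chunk = sock.recv(bufsize)
        if not chunk:
            break
        received += len(chunk)
        calls += 1
    return received, calls


def count_calls(total_bytes, chunk_size):
    """Push total_bytes through a socketpair and report the call counts."""
    a, b = socket.socketpair()
    result = {}
    reader = threading.Thread(
        target=lambda: result.update(rx=drain(b, total_bytes, chunk_size)))
    reader.start()
    result["tx_calls"] = send_in_chunks(a, bytes(total_bytes), chunk_size)
    a.close()
    reader.join()
    b.close()
    return result


if __name__ == "__main__":
    for size in (1024, 64 * 1024):
        print(size, count_calls(1 << 20, size))
```

Each call costs a trip into the kernel regardless of how many bytes it
carries, so fewer, larger calls leave more CPU for everything else.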