Subject: cpu utilization during network traffic (100Mbit Ethernet), RFC 1323
To: None <tech-kern@netbsd.org>
From: Dave Olson <olson@bengaltech.com>
List: tech-kern
Date: 01/15/2000 23:03:07

I've been doing some performance testing and tuning, and
quickly noticed that I was burning anywhere from 30% to 75% of
a 400MHz Pentium II during tests with ttcp (with Intel 82559 fxp,
assorted tulip chips, 3com 905B, and a couple of others, so it's
not driver- or chip-specific).  I was transferring about 150MBytes
of data, and getting anywhere from about 80% to 95% of wire speed,
so at least the bandwidth was OK.


After searching around for a while in dejanews, I found an old
reference to a poor RFC 1323 implementation in NetBSD (in mail dated 1996!!!).

Sure enough, turning off a few of the 1323 tuning variables
with sysctl dropped the cpu overhead dramatically.  With the
variables listed below set to zero, the overhead is about 30-35%
on xmit, and still somewhat high on receive at about 50% (though
that is much better than before).  Linux on the same chips and test
burns about 30-35% of a cpu in either direction.

Surely after 3 years it's time to either fix the broken code
(I don't have time), or at least default the setting to off?

Relatively few sites will benefit from the 1323 (large windows) work
being enabled by default, since it takes a large bandwidth-delay product
(high latency, high bandwidth, or both) to see substantial benefit, so
I'd suggest defaulting it to off.  It's a bit embarrassing to see all the
claims of NetBSD's high network performance in the face of something
like this...

In fact, I even got slightly higher throughput for this case (two systems
on the same subnet) with the variables turned off.  The 3 variables
I turned off were:
	sysctl -wn net.inet.tcp.rfc1323=0
	sysctl -wn net.inet.tcp.win_scale=0
	sysctl -wn net.inet.tcp.timestamps=0
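
A rough way to see why large windows buy little between two hosts on
one subnet: without window scaling, TCP can advertise at most 65535
bytes, so scaling only helps when the bandwidth-delay product of the
path exceeds that.  The sketch below just does the arithmetic in plain
sh.  The helper name bdp_bytes and the RTT figures (1 ms for a LAN,
70 ms for a long-haul path) are illustrative assumptions, not
measurements from these tests.

```shell
#!/bin/sh
# Bandwidth-delay product: bytes that must be in flight to keep a link busy.
# bdp_bytes is a hypothetical helper; the RTT values used below are assumed,
# not measured.
bdp_bytes() {
    mbit_per_sec=$1   # link speed in Mbit/s
    rtt_ms=$2         # round-trip time in milliseconds
    # Mbit/s * 1000 = bits per millisecond; divide by 8 for bytes per ms.
    echo $(( mbit_per_sec * 1000 / 8 * rtt_ms ))
}

max_unscaled=65535    # largest window TCP can advertise without window scaling

for rtt in 1 70; do
    bdp=$(bdp_bytes 100 "$rtt")
    if [ "$bdp" -gt "$max_unscaled" ]; then
        echo "100 Mbit/s, ${rtt} ms RTT: BDP ${bdp} bytes, window scaling can help"
    else
        echo "100 Mbit/s, ${rtt} ms RTT: BDP ${bdp} bytes, fits in an unscaled window"
    fi
done
```

Under those assumed RTTs, the LAN case fits comfortably in an unscaled
window, while the long-haul case does not.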

I'm not sure that timestamps really made a significant difference;
I didn't test the 3 separately.
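
For anyone who wants to separate the three, something like the
following could drive the test.  It is only a sketch: it assumes 1 is
the "enabled" value for each variable, it prints the sysctl commands
rather than running them unless RUN=1 is set, and the helper names
(run_cmd, isolate_each) are made up for illustration.  The ttcp
transfer itself is left as a manual step.

```shell
#!/bin/sh
# One test pass per RFC 1323 variable: turn off only that variable so its
# cost can be measured on its own.  The sysctl names are the three from this
# message; run_cmd and isolate_each are hypothetical helpers, and "1 means
# enabled" is an assumption.
vars="net.inet.tcp.rfc1323 net.inet.tcp.win_scale net.inet.tcp.timestamps"

# Dry run by default: echo the command unless RUN=1 is set in the environment.
run_cmd() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$@"; fi
}

isolate_each() {
    for off in $vars; do
        for v in $vars; do
            if [ "$v" = "$off" ]; then val=0; else val=1; fi
            run_cmd sysctl -wn "$v=$val"
        done
        echo "# now run the ttcp transfer and note cpu use with only $off disabled"
    done
}

isolate_each
```

If rfc1323 acts as a master switch for the other two, the pass with
only rfc1323 disabled may look the same as disabling all three; I
haven't verified that.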

If we don't want to change these defaults, we ought to at least
mention this issue in some prominent places (the man pages, etc.).

(Oh, this is on 1.4.1, and also on -current.)


As part of my investigation, I tried to use the kernel profiling
tools, but quickly discovered that they aren't working correctly.
Very little of the time spent at interrupt level is counted and
even when a user program is burning every spare cycle, the data
from the profiler (as reported by gprof) shows that the cpu is
80% idle...  Has there been any work on improving the usefulness
of the profiler that I might have missed?

Does anybody have any suggestions on where the rest of the
cpu usage might be going?  For comparison, Solaris on UltraSPARC burns
around 25% on xmit, 10% on rcv, and as I recall from my SGI
days, IRIX burned less than 10% in either direction for 100Mbit
(and about 85% of a 250MHz R10K to get 600Mbit out of Gbit ethernet,
a case where the large windows did help).


Thanks,

Dave Olson
Owned by 6 cats, owner of none...
Personal:  olson@bengaltech.com        Work:  olson@geocast.com
           http://www.bengaltech.com          http://www.geocast.com