Subject: patch! Re: sbappend() is not scalable
To: None <firstname.lastname@example.org>
From: Alfred Perlstein <email@example.com>
Date: 10/11/1999 18:04:55
On Fri, 8 Oct 1999, Mohit Aron wrote:
> I recently did some experiments with TCP over a high b/w-delay path
> and found a scalability problem in sbappend(). The experimental setup
> consisted of a 100Mbps network with a round-trip delay of 100ms. Under this
> situation, FreeBSD's TCP version is incapable of attaining more than 65 Mbps
> on a 300MHz Pentium II - even without slow-start.
> I tracked down the problem to sbappend() - the routine that appends user data
> into the socket buffers for network transmission. Every time a TCP ACK
> acknowledges some data, space is created in the socket buffer that permits
> more data to be appended. Unfortunately, the implementation does not maintain
> a pointer to the end of the list of mbufs in the socket buffer. Thus each
> time any data is added, the whole list of mbufs is traversed to reach the
> very end where the data is added. Since the b/w-delay product is large, there
> can be about 600 mbufs in the socket buffer waiting to be acknowledged. Thus
> upon every ACK, about 600 mbufs are traversed causing the TCP sender to run
> out of CPU.
> The problem is not limited only to high b/w networks - it is also present in
> long latency paths (satellite links). Thus a server transferring a large file
> over a satellite link can spend a lot of CPU due to the above problem.
> Hope the problem shall be fixed in future releases,
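
For anyone following along, here is a minimal sketch of the problem
described above. It is not the patch itself. struct buf and
struct chain are hypothetical, simplified stand-ins for the kernel's
mbuf and socket buffer, and all function names are made up for
illustration. bdp_clusters() estimates how many buffers a path keeps
in flight, assuming each buffer holds one 2048-byte cluster.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an mbuf: only the link matters here. */
struct buf {
    struct buf *next;
};

/* Hypothetical stand-in for a socket buffer. */
struct chain {
    struct buf *head;
    struct buf *tail;   /* cached last element, so appends need no walk */
};

/* Append by walking from the head, as the quoted report describes.
 * Returns the number of links followed to reach the end. */
static size_t append_walk(struct chain *c, struct buf *b)
{
    size_t steps = 0;

    assert(b != NULL);
    b->next = NULL;
    if (c->head == NULL) {
        c->head = b;
    } else {
        struct buf *p = c->head;
        while (p->next != NULL) {
            p = p->next;
            steps++;
        }
        p->next = b;
    }
    c->tail = b;        /* keep the cached tail valid either way */
    return steps;
}

/* Append using the cached tail: constant time, no links followed. */
static size_t append_tail(struct chain *c, struct buf *b)
{
    assert(b != NULL);
    b->next = NULL;
    if (c->tail == NULL)
        c->head = b;
    else
        c->tail->next = b;
    c->tail = b;
    return 0;
}

/* Remove the first buffer (data acknowledged). The tail must be
 * cleared when the chain becomes empty, or it would dangle. */
static struct buf *drop_head(struct chain *c)
{
    struct buf *b = c->head;

    if (b != NULL) {
        c->head = b->next;
        if (c->head == NULL)
            c->tail = NULL;
        b->next = NULL;
    }
    return b;
}

/* Bandwidth-delay product expressed in whole buffers of
 * cluster_bytes each (integer division, rounds down). */
static unsigned long long bdp_clusters(unsigned long long bits_per_sec,
                                       unsigned long long rtt_ms,
                                       unsigned long long cluster_bytes)
{
    unsigned long long bytes = bits_per_sec * rtt_ms / 1000 / 8;
    return bytes / cluster_bytes;
}
```

With 600 buffers queued, one append_walk() call follows 599 links
while append_tail() follows none. Under the 2048-byte assumption,
bdp_clusters(100000000, 100, 2048) gives 610, in line with the
"about 600" figure quoted above.
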
I'm not sure how well these patches will apply under NetBSD, but I've
got something in the works for FreeBSD. It seems to work pretty well,
but I'd like a larger audience to test it out.
The patches are for FreeBSD-current as of this morning.
The patches also have a smarter (imo) version of sbcompress()
that will attempt to copy less data if it can. I apologize
in advance for the gratuitous style changes but I needed to
make the code a bit more readable.
Any feedback would be much appreciated, as I don't have a
high-delay, high-bandwidth LAN to work with.
I'm also pretty sure I'm not subscribed to this list, so please
don't neglect to cc me.
-Alfred Perlstein - [firstname.lastname@example.org, email@example.com]
Wintelcom systems administrator and programmer
- http://www.wintelcom.net/ [firstname.lastname@example.org]