Subject: Re: TCP socket buffers automatic sizing
To: Mindaugas R. <rmind@NetBSD.org>
From: Greg Troxel <gdt@ir.bbn.com>
List: tech-net
Date: 07/21/2007 08:03:27
> [1]. http://www.netbsd.org/~rmind/tcp_buf_autosizing.diff

tcp(4) should discuss this.

Is there an RFC about this, or an implementation elsewhere?  If so, that
should be mentioned.  A quick search turns up a prior implementation in
NetBSD 1.2 (!):

  http://www.psc.edu/networking/ftp/papers/autotune_sigcomm98.ps

an early work that uses pre-measurement:

  http://dast.nlanr.net/Projects/Autobuf/autotcp.html

a survey article:

  http://www.lanl.gov/radiant/pubs/hptcp/hpdc02-drs.pdf

a paper that seems to talk about shrinking buffers too:

  http://citeseer.ist.psu.edu/dovrolis04socket.html

FreeBSD had a patch in the fall - I don't know if it's in now:

  http://www.freebsd.org/news/status/report-2006-10-2006-12.html#Automatic-TCP-Send-and-Receive-Socket-Buffer-Sizing

a summary page with lots of links:

  http://kb.pert.geant2.net/PERTKB/TCPBufferAutoTuning


Have you looked at connections with tcpdump2xplot and xplot (both in
pkgsrc/graphics/xplot)?  They are very illuminating about TCP behavior.


I think I may have misunderstood in my previous comments.  How does the
advertised window relate to the current allocated buffer size?  If it's
only what's allocated, then my concerns about dropping in-window
segments are probably at least mostly incorrect.  But, we have to
advertise a large window to get the sender to open up, even if our
application reads the data promptly.
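
To make the question concrete, here is a toy model of what I'd expect
if the advertised window simply tracks free space in the currently
allocated buffer.  This is only a sketch under that assumption, not a
description of what the patch does; the function and parameter names
are made up for illustration.

```python
def advertised_window(buf_size: int, bytes_queued: int,
                      max_window: int = 65535) -> int:
    """Toy model: window = free space in the receive buffer, capped.

    buf_size     -- currently allocated receive buffer, in bytes
    bytes_queued -- data received but not yet read by the application
    max_window   -- largest value the window field can express
                    (65535 without window scaling, 65535 << shift with it)
    """
    free = max(buf_size - bytes_queued, 0)  # never advertise a negative window
    return min(free, max_window)
```

Under this model a receiver with a 32768-byte buffer and a prompt
reader never advertises more than 32768 bytes, so the sender can't
open up past that no matter how large the path's bandwidth-delay
product is.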

I don't understand the "only if no reordering" constraint.  It would
seem that increasing the buffer is warranted if we receive a segment
that we'd like to store.  But I can see the point that if we missed a
segment, and therefore have not consumed data that we otherwise would
have, the buffer doesn't need to stay big.

Consider the case of a network that has no drops and no reorders, a
large bandwidth-delay product, a sender with a large buffer, and a
receiver with an application that receives promptly.  The rx buffer will
remain small.  Now, assume one dropped packet.  The entire in-flight
data will arrive and need buffering, at least until fast retransmit or
SACK causes a resend (RTT plus a few packets).  Given that we've
advertised the receive window, it seems rude and contrary to the TCP
RFCs to not accept the segments if we have memory (if we don't have
memory, we've overcommitted and gotten unlucky, which is central to this
scheme - I'm not objecting to that case).
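
A back-of-the-envelope sketch of how much data that is.  The link
speed, RTT, MSS, and the "few packets" count below are illustrative
assumptions, not measurements:

```python
def bdp_bytes(bandwidth_bps: int, rtt_ms: int) -> int:
    """Bandwidth-delay product in bytes (integer arithmetic)."""
    return bandwidth_bps * rtt_ms // 8000  # bits -> bytes, ms -> s

def buffering_after_drop(bandwidth_bps: int, rtt_ms: int,
                         mss: int = 1460, extra_segments: int = 3) -> int:
    """Rough receive-side buffering needed after a single drop.

    Everything already in flight arrives out of order behind the hole
    and has to be held until the retransmission shows up: about one
    RTT of data plus a few segments.
    """
    return bdp_bytes(bandwidth_bps, rtt_ms) + extra_segments * mss
```

For an assumed 100 Mbit/s path with a 100 ms RTT, bdp_bytes gives
1250000 bytes, and buffering_after_drop gives 1254380 bytes, far more
than a receive buffer that stayed small because the application was
reading promptly.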

How does the receive window get set?  Should it be the max that autosize
will go to?  If it's not large, how does this help?

Shouldn't SB_AUTOSIZE only be enabled on sockets if the sysctl is on?
Shouldn't there be a socket option?  Does explicitly setting a buffer
size clear SB_AUTOSIZE?
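
By "explicitly setting a buffer size" I mean the ordinary SO_RCVBUF /
SO_SNDBUF path, sketched here from userland.  Only the standard socket
options are used; the helper name is made up, and whether this call
should turn autosizing off for the socket is exactly the open
question.

```python
import socket

def set_explicit_rcvbuf(sock: socket.socket, size: int) -> int:
    """Request a fixed receive buffer size and return what the kernel reports.

    The reported value may differ from the request (some systems round
    it or double it for bookkeeping).
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size)
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)

if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        print(set_explicit_rcvbuf(s, 65536))
```

An application that does this has presumably picked the size on
purpose, so I'd expect the kernel to leave that socket's buffer alone
afterwards.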

I think this is a good feature, especially for busy servers, so I'm not
saying you shouldn't move forward.  I wouldn't even say that the above
is cause not to commit it.

Probably this should default to off at first (sysctl initial setting)
until more people have run it, which will happen once it's in current -
sysctl -w is easier than patch/rebuild.  I'd turn it on, but haven't
done a reboot cycle to try it.