Subject: Re: Concerns about our NewReno code
To: Bill Studenmund <wrstuden@NetBSD.org>
From: Jonathan Stone <jonathan@dsg.stanford.edu>
List: tech-net
Date: 11/08/2004 11:05:11
Bill,

If you are concerned about TCP performance issues between NetBSD and
{MacOS,FreeBSD-4.4}, then your first task must be to eliminate the
`squashed ACK' bug from consideration. That is, compare and contrast:

	a) TCP between NetBSD and FreeBSD-4.4
	b) TCP between NetBSD and FreeBSD-4.5

or some later release than FreeBSD-4.5. (If (dim) memory serves, the
changes to fix the so-called `squashed-ACK' bug were posted by Matt
Dillon to FreeBSD-hackers, then committed to FreeBSD's RELENG_4 branch,
circa November 2001. You want the third of a series of 3 commits by
Matt Dillon.)

That said -- without looking at the code, Jeffrey Hsu's description
sounds clean, perhaps cleaner than what we have now. Let's wait and
see what Jason says?


In message <20041108184922.GC20869@netbsd.org>,
Bill Studenmund writes:

>
>Recently, I was helping a customer debug a latency-sensitive application
>over TCP, where they were sending a lot of data from MacOS X. The TCP
>connection would just stop for over a second. Upon further investigation,
>we realized we were seeing problems with multi-packet drop recovery. We
>were facing the exact issue that the "NewReno" code was designed to
>address.
>
>We then remembered that MacOS 10.3 (and a number of earlier versions) has
>(have) a TCP stack taken from FreeBSD 4.4. So we started looking at the
>changes made to the FreeBSD 4 TCP stack, and found revision 1.107.2.36 of
>their tcp_input.c, which has the following comment:
>
>*****
>Merge from current
>  rev 1.170:  Cosmetic-only changes for readability.
>  rev 1.187:  Fix NewReno.
>
>Rev 1.170 was done primarily to expose the shortcomings of the handling
>of t_dupacks field in the old NewReno code.  Rev 1.187 replaces the old
>NewReno logic with an implementation which closely follows the letter
>of the spec.
>*****
>
>"Fix NewReno"... So we looked into it, and made patches to Darwin, and it
>seemed to work better.
>
>So now here's where this turns into a NetBSD issue. NetBSD's NewReno code
>SURE looks like the code FreeBSD and MacOS had. While the comment in
>FreeBSD's cvs is rather vague, hsu@freebsd dot org explained more in a
>follow-up ( http://docs.freebsd.org/cgi/getmsg.cgi?fetch=557424+0+archive/2003/cvs-all/20030119.cvs-all ):
>
>The state for when we enter, are in, and leave the NewReno Fast Recovery
>period has been split out from t_dupacks into its own state variable,
>snd_high, which has the semantics described in the spec
>  RFC2582, NewReno Modification to TCP's Fast Recovery
>for the variable called "send_high".  Previously, this state was
>overloaded in the t_dupacks field of the tcpcb.  The problem with this
>is a number of conditions which reset t_dupacks such as data flowing
>back the other way, window size changes, and re-ordered acks which
>erroneously kick you out of Fast Recovery mode.  The end result
>is the TCP stack often has to wait for a timeout to retransmit, which
>would have been avoided if NewReno was working correctly.  Tom Henderson
>has analyzed before and after packet traces and the ones before were very
>sick.  Now, we correctly transition into and out of Fast Recovery, do the
>correct window adjustments on partial acks, and retransmit when we should.
>
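
For anyone following along without the source open, here is a rough
sketch of the failure mode described above. It is a toy model, not the
NetBSD or FreeBSD code: struct toy_tcpcb, the in_recovery flag, and the
helper names are all made up for illustration. It only shows why
inferring "am I in Fast Recovery?" from t_dupacks is fragile, and why a
separate state variable survives a counter reset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified sender state; NOT the real struct tcpcb. */
struct toy_tcpcb {
	uint32_t snd_una;     /* oldest unacknowledged sequence number */
	uint32_t snd_max;     /* highest sequence number sent so far */
	uint32_t snd_recover; /* snd_max recorded when recovery began */
	int      t_dupacks;   /* consecutive duplicate ACK counter */
	bool     in_recovery; /* explicit recovery state (assumed name) */
};

#define TOY_REXMT_THRESH 3 /* duplicate ACKs needed to start recovery */

/* Old style: recovery state is overloaded onto the dupack counter. */
static bool
old_in_recovery(const struct toy_tcpcb *tp)
{
	return tp->t_dupacks >= TOY_REXMT_THRESH;
}

/* New style: recovery has its own state variable. */
static bool
new_in_recovery(const struct toy_tcpcb *tp)
{
	return tp->in_recovery;
}

/* Called on the third duplicate ACK: remember where recovery ends. */
static void
enter_recovery(struct toy_tcpcb *tp)
{
	tp->snd_recover = tp->snd_max;
	tp->in_recovery = true;
}

/* Reverse-direction data, a window update, etc. reset the counter. */
static void
unrelated_event(struct toy_tcpcb *tp)
{
	tp->t_dupacks = 0;
}

/*
 * Handle an ACK that advances snd_una.  Returns true for a partial ACK
 * (still short of snd_recover), meaning the next hole should be
 * retransmitted at once rather than waiting for a timeout.
 */
static bool
on_new_ack(struct toy_tcpcb *tp, uint32_t ack)
{
	tp->snd_una = ack;
	if (!tp->in_recovery)
		return false;
	if ((int32_t)(ack - tp->snd_recover) >= 0) {
		/* Full ACK: everything outstanding at entry is covered. */
		tp->in_recovery = false;
		tp->t_dupacks = 0;
		return false;
	}
	return true;
}
```

With the old test, one unrelated_event() in the middle of recovery makes
the stack believe it has left Fast Recovery, so the partial ACK that
follows is not treated as a cue to retransmit. With the explicit flag,
the same sequence of events still ends in a retransmit.
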
>In addition, the variable named "send_high" in the spec has been split
>out from snd_recover, in order to make the check for more explicit
>and to detect for sequence wraparound.  This new version of the
>NewReno logic implements what the spec calls the Careful variant of Fast
>Retransmit, which is the version recommended by the spec.
>
>							Jeffrey
>
>
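
On the sequence wraparound point: a minimal sketch of the kind of
comparison involved, assuming a check of the form "does this ACK cover
more than the recorded high-water mark?". The helpers below are written
in the style of the BSD SEQ_* macros, but the function names and the
ack_covers_more_than() wrapper are made up for illustration and are not
taken from any tree.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Wraparound-safe sequence comparisons: subtract as unsigned, then
 * look at the sign of the 32-bit difference.  (Relies on the usual
 * two's-complement conversion to int32_t.)
 */
static int
seq_gt(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) > 0;
}

static int
seq_geq(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) >= 0;
}

/* Hypothetical helper: does the ACK cover more than the mark? */
static int
ack_covers_more_than(uint32_t ack, uint32_t high_mark)
{
	return seq_gt(ack, high_mark);
}

/* Naive version, shown only to illustrate what goes wrong. */
static int
naive_covers_more_than(uint32_t ack, uint32_t high_mark)
{
	return ack > high_mark;
}
```

If the mark was recorded just below 2^32 and the ACK arrives just after
the sequence space wraps, the naive comparison says the ACK is behind
the mark, while the signed-difference comparison gets it right.
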
>
>So does anyone else think we need this change too? I can cobble a diff
>together, but maybe someone else'd like to look at this?
>
>Take care,
>
>Bill
>