Subject: Re: TCP/Westwood+ support.
To: Kentaro A. Kurahone <kurahone@sekhmet.sigusr1.org>
From: Charles M. Hannum <abuse@spamalicious.com>
List: tech-kern
Date: 01/01/2005 19:22:36
On Saturday 01 January 2005 14:49, Kentaro A. Kurahone wrote:
> I've implemented TCP/Westwood+ congestion control for NetBSD.  The authors
> claim that it deals better with high BDP lossy networks (like wireless).
> Since I'm using it to ssh to another box to write this e-mail, I'm fairly
> confident that I didn't break anything.
>
> Description: http://www-ictserv.poliba.it/mascolo/tcp%20westwood/homeW.htm
> Patch:
> http://www.sigusr1.org/~kurahone/tcp-westwood+-netbsd-2.99.11.diff.gz
>
> Feedback will be appriciated.

If I understand your code correctly, it makes two changes to the algorithm:

a) A running bandwidth estimate based on the ack rate is kept, and ssthresh is 
set according to that -- in other words, we do exponential growth up to the 
estimated bandwidth, and then linear growth thereafter, whereas Reno will 
just keep trying to increase ssthresh forever.

b) On a fast retransmit, the congestion window is initialized to ssthresh -- 
in other words, we always set to the linear growth point on a fast 
retransmit, and never do exponential growth again, except in the case of slow 
retransmit (or CWM).

Both of these behaviors are, on their face, more conservative than Reno.
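
To make (a) and (b) concrete, here is a minimal sketch of how I read them.  It 
is not taken from the patch: the struct, the field names, the units and the 
two-segment floor are all assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical per-connection state; none of these names come from the patch. */
struct ww_state {
	uint32_t bwe;       /* bandwidth estimate from the ack rate, bytes/sec */
	uint32_t rtt_min;   /* smallest RTT observed, milliseconds */
	uint32_t mss;       /* maximum segment size, bytes */
	uint32_t cwnd;      /* congestion window, bytes */
	uint32_t ssthresh;  /* slow-start threshold, bytes */
};

/*
 * (a) ssthresh tracks the estimated bandwidth-delay product.
 * The floor of two segments is an assumption of this sketch.
 */
uint32_t
ww_ssthresh(const struct ww_state *s)
{
	uint64_t bdp = (uint64_t)s->bwe * s->rtt_min / 1000;
	uint64_t min_thresh = 2ULL * s->mss;

	if (bdp < min_thresh)
		bdp = min_thresh;
	if (bdp > UINT32_MAX)
		bdp = UINT32_MAX;
	return (uint32_t)bdp;
}

/*
 * (b) On a fast retransmit, cwnd restarts at ssthresh, i.e. directly at
 * the point where linear growth begins.
 */
void
ww_fast_retransmit(struct ww_state *s)
{
	s->ssthresh = ww_ssthresh(s);
	s->cwnd = s->ssthresh;
}
```

With an estimate of 1,000,000 bytes/sec and a minimum RTT of 50 ms, both 
ssthresh and cwnd come out at 50,000 bytes after a fast retransmit.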

My question is: what happens if the available bandwidth fluctuates rapidly, 
e.g. because the link is shared with another system doing occasional short 
transactions that are not really congestion-controlled?  It appears to me 
that Westwood is fairly slow to react in decreasing ssthresh and switching to 
the more conservative linear growth.  This could be problematic in some 
circumstances.


I also see a few problems with the implementation:

1) In the fast-retransmit case, you are blindly setting cwnd to ssthresh; if 
cwnd is already less than ssthresh, you should not do this.

2) Your simple arithmetic in tcp_westwood_p_bwe() is potentially susceptible 
to roundoff issues on low-bandwidth links, similar to the ones Brakmo and 
Peterson complained about in the TCP Vegas paper (and that we fixed years 
ago).  It's probably not as bad since you're dealing with byte counts rather 
than packet counts, though.

3) In tcp_westwood_p(), you are always setting dupacks to 0.  This is a bit 
odd in that it could cause multiple fast-retransmit actions to fire within 
one window.  It also breaks detection and recovery of cwnd after the fast 
retransmit is acked and we're caught up -- but all that does is set 
cwnd=max(cwnd,ssthresh), and we've already done that in the Westwood case.  I 
think this effect is not very clear from the code.
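
A rough sketch of what I have in mind for (1) and (2).  I am guessing at the 
shape of the arithmetic here: the 1/8 filter gain, the function names and the 
scaled representation are assumptions of this example, not what 
tcp_westwood_p_bwe() actually does.  The scaled form is the same trick 
commonly used for the smoothed RTT: keep the state multiplied by 8 so the 
remainder of the division is not thrown away on every update.

```c
#include <stdint.h>

#define WW_SHIFT 3	/* assumed filter gain of 1/8; hypothetical constant */

/* (1) Only ever lower cwnd on a fast retransmit, never raise it. */
uint32_t
ww_clamp_cwnd(uint32_t cwnd, uint32_t ssthresh)
{
	return cwnd < ssthresh ? cwnd : ssthresh;
}

/*
 * (2) Naive filter on unscaled integers: bwe = (7 * bwe + sample) / 8.
 * When sample and bwe differ by less than 8, the division truncates the
 * difference away and the estimate stops moving.
 */
uint32_t
ww_filter_naive(uint32_t bwe, uint32_t sample)
{
	return (uint32_t)((7ULL * bwe + sample) / 8);
}

/*
 * Scaled filter: the state holds 8 * bwe, so the fractional part of the
 * estimate survives between updates.  Read the estimate back with
 * (state >> WW_SHIFT).
 */
uint64_t
ww_filter_scaled(uint64_t bwe_scaled, uint32_t sample)
{
	return bwe_scaled - (bwe_scaled >> WW_SHIFT) + sample;
}
```

Starting both filters at 100 and feeding them a constant sample of 105, the 
naive version stays at 100 indefinitely, while the scaled version converges 
to 105.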


I think the congestion control should be separated into pluggable functions.  
With Reno, New-Reno and Westwood, there are getting to be far too many 
conditionals in that code, and it's becoming a lot less clear how it operates 
in the usual case.
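
As a sketch of what "pluggable" could look like: a small table of function 
pointers per algorithm, with the common code calling through it.  Everything 
below (struct cc_ops, struct cc_conn, the hook name) is hypothetical and much 
simplified; the Reno hook in particular omits the window inflation during 
recovery.

```c
#include <stdint.h>

/* Hypothetical, pared-down connection state. */
struct cc_conn {
	uint32_t cwnd;      /* congestion window, bytes */
	uint32_t ssthresh;  /* slow-start threshold, bytes */
	uint32_t mss;       /* maximum segment size, bytes */
	uint32_t bdp;       /* estimated bandwidth * min RTT, bytes */
};

/* One entry per congestion-control algorithm. */
struct cc_ops {
	const char *name;
	void (*fast_retransmit)(struct cc_conn *);
};

/* Simplified Reno: halve the window, with a two-segment floor. */
static void
reno_fast_retransmit(struct cc_conn *c)
{
	uint32_t half = c->cwnd / 2, min_thresh = 2 * c->mss;

	c->ssthresh = half > min_thresh ? half : min_thresh;
	c->cwnd = c->ssthresh;
}

/* Simplified Westwood+: ssthresh from the estimate; cwnd is only lowered. */
static void
westwood_fast_retransmit(struct cc_conn *c)
{
	uint32_t min_thresh = 2 * c->mss;

	c->ssthresh = c->bdp > min_thresh ? c->bdp : min_thresh;
	if (c->cwnd > c->ssthresh)
		c->cwnd = c->ssthresh;
}

static const struct cc_ops cc_reno = { "reno", reno_fast_retransmit };
static const struct cc_ops cc_westwood = { "westwood+", westwood_fast_retransmit };

/* The common input path would carry no per-algorithm conditionals. */
void
cc_on_fast_retransmit(const struct cc_ops *ops, struct cc_conn *c)
{
	ops->fast_retransmit(c);
}
```

The point of the design is that the usual-case code reads the same whichever 
algorithm is selected, and adding another one means adding a table entry 
rather than another branch.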