tech-net: Re: paper on improving Webserver performance

Subject: Re: paper on improving Webserver performance
To: None <mallman@grc.nasa.gov>
From: Mohit Aron <aron@cs.rice.edu>
List: tech-net
Date: 07/07/1999 10:43:48
> 
> I took a quick read of this paper while ingesting my morning
> caffeine a few moments ago.  There are some very interesting ideas
> in here, but I wanted to note one thing about the RTO timers.  Vern
> Paxson and I have plumbed some of these waters and found that the
> single biggest property a good estimator should have is a fairly
> large minimum RTO, which the proposal in the above paper does not
> provide.  In our analysis of a large number of TCP connections over
> ~1000 Internet network paths we found that by making the RTO more
> aggressive you increased the number of spurious retransmissions by a
> fairly hefty amount (which also hurts performance).  The following
> paper details our findings:
> 
>     Mark Allman, Vern Paxson. On Estimating End-to-End Network Path
>     Properties. ACM SIGCOMM, September 1999.  To appear. 
>     http://roland.grc.nasa.gov/~mallman/papers/estimation.ps
> 
> So, the point that I am attempting to convey to the NetBSD TCP folks
> is that it seems as if the jury is still out on how to do RTO timers
> right.  My opinion is that further Internet testing is required
> before we have an extremely good understanding of RTO estimators and
> therefore it may not be a good time to go messing with the
> production NetBSD code.
> 
> Just my $0.02!
> 


The above is probably correct. However, the paper does the following:

1) The vanilla TCP RTO estimation algorithms are used, albeit with a
   fine-grained clock. What you're suggesting above is that these
   algorithms still don't provide good enough estimates in some cases.
   I don't think that is a reason to use a coarse-grained clock;
   perhaps more research needs to go into the estimation algorithms.

2) The paper suggests a minimum bound of 200 ms on the timeout anyway,
   because ACKs might be delayed by that amount. Since most RTTs are well
   below this, it might provide a sufficient cushion until the
   estimators are fixed. Another interesting thought (not covered in the
   paper) is that one could even do without the 200 ms minimum bound in
   some cases, e.g. when there are enough packets in the window, you
   don't expect ACKs to be delayed. However, I did realize that in the
   current Internet a somewhat fat RTO is desirable and hence didn't
   make this proposal.
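
To make the two points above concrete, here is a rough sketch, not the
paper's code and not the NetBSD code. It assumes the usual Jacobson-style
estimator (smoothed RTT plus four times the smoothed deviation, gains 1/8
and 1/4) fed with fine-grained samples in milliseconds, and clamps the
result to a 200 ms floor. The names RtoEstimator, segments_in_flight and
skip_floor_at are made up for illustration; the skip_floor_at switch only
illustrates the "enough packets in the window" thought, which is not
proposed in the paper.

```python
from typing import Optional

ALPHA = 1 / 8            # gain for the smoothed RTT (assumed standard value)
BETA = 1 / 4             # gain for the RTT deviation (assumed standard value)
DELAYED_ACK_MS = 200.0   # minimum bound discussed above


class RtoEstimator:
    """Hypothetical fine-grained RTO estimator with a minimum bound."""

    def __init__(self, min_rto_ms: float = DELAYED_ACK_MS) -> None:
        self.min_rto_ms = min_rto_ms
        self.srtt: Optional[float] = None    # smoothed RTT, ms
        self.rttvar: Optional[float] = None  # smoothed deviation, ms

    def sample(self, rtt_ms: float) -> None:
        """Feed one RTT measurement taken with a fine-grained clock."""
        if self.srtt is None:
            # First measurement seeds both state variables.
            self.srtt = rtt_ms
            self.rttvar = rtt_ms / 2
        else:
            # Update the deviation first, using the old smoothed RTT.
            self.rttvar = (1 - BETA) * self.rttvar + BETA * abs(self.srtt - rtt_ms)
            self.srtt = (1 - ALPHA) * self.srtt + ALPHA * rtt_ms

    def raw_rto(self) -> float:
        """Estimator output before any minimum bound is applied."""
        if self.srtt is None:
            raise ValueError("no RTT sample yet")
        return self.srtt + 4 * self.rttvar

    def rto(self, segments_in_flight: int = 1,
            skip_floor_at: Optional[int] = None) -> float:
        """RTO in ms, clamped to the minimum bound.

        If skip_floor_at is set (an assumption for illustration only) and
        at least that many segments are outstanding, the floor is skipped
        on the theory that ACKs will not be delayed.
        """
        raw = self.raw_rto()
        if skip_floor_at is not None and segments_in_flight >= skip_floor_at:
            return raw
        return max(raw, self.min_rto_ms)


if __name__ == "__main__":
    est = RtoEstimator()
    for rtt in (10.0, 12.0):   # two LAN-like samples, in ms
        est.sample(rtt)
    print("raw estimate:", est.raw_rto())
    print("with 200 ms floor:", est.rto())
    print("floor skipped:", est.rto(segments_in_flight=4, skip_floor_at=2))
```

With short RTTs like these the raw estimate sits far below 200 ms, so
the floor is what actually sets the timer; that gap is the cushion
mentioned in (2).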


- Mohit