Subject: Re: perhaps time to check our TCP against spec?
To: None <kml@gecko.nas.nasa.gov>
From: Jonathan Stone <jonathan@DSG.Stanford.EDU>
List: tech-net
Date: 04/07/1998 16:31:04
>I think that anything else is broken in the face of pmtu and
>asymmetric routing.  

Kevin, I am not quite arguing that. I'm just saying it's not yet a
standard, I personally would not call it ``widely deployed'', and I am
not convinced you can rely on everybody else doing it.  

And I believe that, very probably, with some thought, you can do
*significantly* better than the one ACK per two packets.
Over very low-bandwidth or packet-rate-limited nets,
that's a significant win.

See the end of the message for how.



>The packet sizes you are likely to get can
>have nothing at all to do with your advertised MSS (well, they 
>better be less than it).  If you sent an MSS of FDDIMTU (~4K),
>and wound up communicating over a PPP link with an MTU of 512 bytes,
>using pmtu, you'll ACK every 8 packets!

Kevin, I am not trying to argue against the performance optimizations
you (collectively; I don't know who came up with each idea, and I'd be
glad to give credit where it's due: this one is Jason's, yes?) are doing
for PMTU.  They seem on the whole like fairly good ideas to me.
There may be better ideas yet, though.


But (though I'd rather not be), I am banging on your heads to get you
to wake up, broaden your horizons a bit, and look back over your
shoulder at the big, old, ugly legacy non-PMTU world. 


It's still there; people have built networks that rely on the old
negotiated-MSS semantics.  You want to break that for NetBSD users and
leave them no recourse at all.  I am telling you that's not
acceptable.  You're more than welcome to do in_maxmtu, but ONLY so
long as the NetBSD user has the chance to turn it off.

That is because the NetBSD user may not control the rest of their site
subnet, or their mobile-IP package, or the mobile-IP package their
peers use.  And those peers may still be relying on the old behaviour.

I'm not saying you _can't_ do in_maxmtu. I *am* saying in_maxmtu
breaks existing required behaviour, and therefore it MUST be
configurable, and it (MUST or SHOULD) default to off, at least
if PMTU is off.

I do understand the points you're making about ACKing and in_maxmtu,
but I don't see how they in any way negate my points.
Am I missing something?


>In the presence of pmtu, I'd argue that full-sized segment is
>nearly impossible to define, and that ACKing every second segment
>is a win.

It's not that simple.  (read: wrong!)

I would try a definition of ``full-sized segment'' derived from the
instantaneous ``mtu'' derived from PMTU.  And I would push that as far
as it can go. I think it works.  I've not had time to think about it in
detail; that would take a couple of days. But apart from repeated PMTU
oscillation within the RTT estimator's envelope, I don't see how that
would break.  (And how do you change PMTU faster than the RTT?)

See below for cheap, effective ways to compute it.
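
To make the ``one ACK every two full-sized segments'' rule concrete
when ``full-sized'' tracks a moving estimate, here is a minimal sketch
in C.  It is not taken from the NetBSD sources: struct ack_state,
cur_segsz, unacked and ack_on_receive() are all names made up for
illustration, and it assumes something else keeps cur_segsz up to date.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-connection state; this is NOT NetBSD's struct tcpcb. */
struct ack_state {
	uint32_t cur_segsz;	/* current idea of a "full-sized" segment, bytes */
	uint32_t unacked;	/* bytes received since we last sent an ACK */
};

/*
 * Account for one received segment of seglen bytes.  Returns true when
 * two full-sized segments' worth of data is waiting to be ACKed, i.e.
 * an ACK should go out now.  On false the caller is assumed to leave
 * the ACK to its delayed-ACK timer.
 */
static bool
ack_on_receive(struct ack_state *as, uint32_t seglen)
{
	assert(as->cur_segsz > 0);

	as->unacked += seglen;
	if (as->unacked >= 2 * as->cur_segsz) {
		as->unacked = 0;
		return true;
	}
	return false;
}
```

The point of counting bytes against cur_segsz, rather than against the
MSS advertised at connection setup, is that the threshold follows
whatever segment size is actually in effect on the path.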

>I was originally unsure about ACKing every second segment -- it
>seemed like a hack.  

It seemed that way a little to me too for a while. I'll concede it's
reasonable. I do think we could get closer yet to the spirit of ``one
ACK every two full-sized segments'' and the story there is far from over.

>As I've thought about other cases, I've
>come to feel better about it.  Even if we are sending tinygrams,
>Nagle will prevent more from being sent until we ACK the initial
>tinygram, which means that we don't have a problem there.

Nope. Please think harder about X11 traffic and other traffic that
turns off Nagle.  I really, really don't want one ACK for every two
packets on a low-bandwidth link if I'm trying to do any mouse
movement, or keyboard input, or whatever.


>If Nagle is turned off and we're getting deluged with tinygrams,
>well, then, it doesn't seem like that great a sin to be sending
>back an ACK for every two tinygrams.

There *are* scenarios where it does make a clear difference.

>The one case I'd worry about is on very asymmetric links.
>Does anyone have any ideas about that?  I know that there has
>been *some* research, but how much, and what should we do
>about it in NetBSD?

That's outside my direct research interests.  Personally,
I'm not familiar with the state of the art there. 

But there are three immediately obvious `quick hacks' which employ a
more explicit notion of ``the connection segment size'' as something
that varies over time, not just the initial
guaranteed-not-to-be-exceeded PMTU:

a)  For acking purposes, use the largest observed received segment size
    as an upper bound  on the MSS.  Learns about increases, doesn't
    learn about decreases.  Compare to Bellman-Ford routing or
    other dynamic programming.

b)  Do (a) but forget about what you've learnt periodically so that
    you get to notice reductions in the actual MTU in effect.
    Downside:  stretched ACKs immediately after forgetting,
    as you re-learn a better MSS.

c)  decay the observed maximum you've learnt.  Compare to exponential
    decay of RTT estimates.

I'd go with c), but I thought about this for less time in toto than
it's taken to write. (I think faster than I type.)  But (c) looks like
it might just be the `right answer' here, and not just for asymmetric
links, either, but for PMTU calculation of `MSS' for acking purposes,
in general.  
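
For concreteness, the three hacks above might be sketched in C roughly
as follows.  This is only an illustration under stated assumptions:
struct segsz_est and the segsz_*() functions are hypothetical names,
``periodically'' in (b) is counted in received segments rather than
wall-clock time, and the 1/8 decay gain in (c) is borrowed by analogy
from the usual smoothed-RTT gain, not derived from anything.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical estimator state; all sizes in bytes. */
struct segsz_est {
	uint32_t est;	/* current estimate of the segment size */
	uint32_t floor;	/* lower bound we fall back to */
	uint32_t nseen;	/* segments seen since the last reset, for (b) */
};

/* (a) Running maximum: learns increases, never learns decreases. */
static void
segsz_max(struct segsz_est *e, uint32_t seglen)
{
	if (seglen > e->est)
		e->est = seglen;
}

/*
 * (b) Running maximum, forgotten every `period' segments.  Right after
 * a reset the estimate is too small until larger segments are seen.
 */
static void
segsz_max_reset(struct segsz_est *e, uint32_t seglen, uint32_t period)
{
	assert(period > 0);

	if (++e->nseen >= period) {
		e->nseen = 0;
		e->est = e->floor;
	}
	if (seglen > e->est)
		e->est = seglen;
}

/*
 * (c) Decaying maximum: jump up at once, drift down by 1/8 of the gap
 * per smaller segment.  Integer shifting means the decay stalls once
 * the gap is under 8 bytes.
 */
static void
segsz_decay(struct segsz_est *e, uint32_t seglen)
{
	if (seglen >= e->est)
		e->est = seglen;
	else
		e->est -= (e->est - seglen) >> 3;
	if (e->est < e->floor)
		e->est = e->floor;
}
```

With (c), a connection that was seeing 4096-byte segments and then
sees only 512-byte ones moves its estimate down gradually instead of
either never noticing (a) or dropping to the floor in one step (b).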


And, if you use it, please be kind enough to acknowledge me for coming
up with the idea.  Research == ``will come up with ideas for food!''

At a higher level, the problem is that you'd like to use the peer's
dynamically-discovered PMTU, but it's at the wrong end.

The, umm, _really obvious_ answer is to add a PMTU-change option to
TCP and to get PMTU hosts to send an update of their effective MSS to
the peer when the PMTU changes.  Is there some reason not to push for
that?  Does the tcp-impl community not think it's worth the effort?
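
Purely to show how small such an option could be, here is a sketch in
C of encoding and decoding it.  No such option is standardized: the
kind value 253, the 4-byte layout (copied from the shape of the
existing MSS option: kind, length, 16-bit value in network byte
order), and the names TCPOPT_PMTU_UPDATE, pmtu_opt_encode() and
pmtu_opt_decode() are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TCPOPT_PMTU_UPDATE	253	/* placeholder kind, NOT an assigned option */
#define TCPOLEN_PMTU_UPDATE	4	/* kind + length + 16-bit effective MSS */

/*
 * Write the hypothetical option into buf.  Returns the number of bytes
 * written, or 0 if buf is too small.
 */
static size_t
pmtu_opt_encode(uint8_t *buf, size_t buflen, uint16_t eff_mss)
{
	assert(buf != NULL);

	if (buflen < TCPOLEN_PMTU_UPDATE)
		return 0;
	buf[0] = TCPOPT_PMTU_UPDATE;
	buf[1] = TCPOLEN_PMTU_UPDATE;
	buf[2] = (uint8_t)(eff_mss >> 8);	/* network byte order */
	buf[3] = (uint8_t)(eff_mss & 0xff);
	return TCPOLEN_PMTU_UPDATE;
}

/*
 * Parse the hypothetical option at buf.  Returns 0 and fills *eff_mss
 * on success, -1 if the bytes are not a well-formed instance of it.
 */
static int
pmtu_opt_decode(const uint8_t *buf, size_t buflen, uint16_t *eff_mss)
{
	assert(buf != NULL && eff_mss != NULL);

	if (buflen < TCPOLEN_PMTU_UPDATE ||
	    buf[0] != TCPOPT_PMTU_UPDATE ||
	    buf[1] != TCPOLEN_PMTU_UPDATE)
		return -1;
	*eff_mss = (uint16_t)((buf[2] << 8) | buf[3]);
	return 0;
}
```

A receiver that understood the option could feed the decoded value
straight into whatever it uses as ``the connection segment size'' for
ACKing; a receiver that did not would skip it like any other unknown
option.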