tech-kern: Re: TCP Problems (Was: RE: Windows packet size?)

Subject: Re: TCP Problems (Was: RE: Windows packet size?)
To: Jukka Marin <jmarin@pyy.jmp.fi>
From: Jason Thorpe <thorpej@nas.nasa.gov>
List: tech-kern
Date: 09/22/1997 21:54:33
Wow, I must have missed the first message on this thread... hehe, I've
just had my elbows in this part of TCP :-)

On Tue, 23 Sep 1997 07:27:11 +0300 
 Jukka Marin <jmarin@pyy.jmp.fi> wrote:

 > On Mon, Sep 22, 1997 at 10:38:29AM -0600, Alex Barclay wrote:
 > 
 > > 1) Micro$haft are following one of the RFC's for enhancing TCP throughput
 > > by avoiding fragmentation (hence the DF flag that you see)

Erg... do you know which one?  Anyhow, DF is used in PMTU discovery.  It
should be obvious why you don't want to fragment TCP segments at the IP
level - drop one frag, and you have to retransmit the entire segment.
Lose lose.
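
To put rough numbers on that, here is a minimal sketch. The function
names are made up for illustration, and it assumes a 20-byte IPv4 header
with no options and independent loss of each fragment, which real
networks won't strictly obey.

```python
import math

def fragments_needed(ip_payload_len, link_mtu, ip_header=20):
    """Fragments required to carry ip_payload_len bytes over link_mtu.

    Fragment offsets are counted in 8-byte units, so every fragment but
    the last carries a payload that is a multiple of 8 bytes.
    """
    per_fragment = (link_mtu - ip_header) // 8 * 8
    return math.ceil(ip_payload_len / per_fragment)

def delivery_probability(n_fragments, frag_loss_rate):
    """Chance that all fragments arrive, assuming independent loss."""
    return (1 - frag_loss_rate) ** n_fragments

# A full 1500-byte datagram (1480 bytes after the IP header) crossing a
# 576-byte-MTU link is split into several fragments; losing any one of
# them costs the whole TCP segment.
n = fragments_needed(1480, 576)
p = delivery_probability(n, 0.01)
```

With DF set and a segment size that fits the path, the same loss rate
costs you one packet's worth of data rather than the whole segment.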

 > > 2) When a MS box tries to establish a connection it sends its MSS. The dest
 > > box replies with its MSS and MS takes the lower. (As an example - the box
 > > I'm running on here sends an initial MSS of 16K - I'm on token ring, my
 > > NetBSD box (about 20 hops away) replies with an MSS of 1496 (or
 > > thereabouts), MS now uses an MSS of 1496 and sets DF

When a host advertises an MSS, what it is saying is "This is the largest
segment I can receive".  The peer _should_ never send a segment larger
than the min of "advertised MSS" and "the MTU of the path".

The MSS values for each side of a connection _can_ be asymmetric.  MSS is
purely a "this is what I can do, don't exceed it."
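
As a rough sketch of that rule (not NetBSD's actual code): the function
name and the header constants are assumptions for illustration, and the
path MTU is reduced by 40 bytes of IPv4 and TCP headers so that both
sides of the min are in the same units.

```python
IP_HEADER = 20   # assumed: IPv4 header without options
TCP_HEADER = 20  # assumed: TCP header without options

def max_segment_to_send(peer_advertised_mss, path_mtu):
    """Largest TCP payload we should send to this peer.

    Never exceed what the peer said it can receive, and never exceed
    what fits in one unfragmented packet on the path.
    """
    return min(peer_advertised_mss, path_mtu - IP_HEADER - TCP_HEADER)

# The two directions are computed independently, so they can differ:
# here host A advertised 16384 and host B advertised 1460.
a_to_b = max_segment_to_send(peer_advertised_mss=1460, path_mtu=1500)
b_to_a = max_segment_to_send(peer_advertised_mss=16384, path_mtu=1500)
```

Note that nothing here is negotiated; each sender just applies the limit
it was given.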

As of today (since I committed the changes :-), NetBSD's TCP advertises
the largest MTU of all connected networks (excluding loopbacks) less some
overhead as the MSS.  This is an optimization - why not just advertise
64K?  Because there are buggy hosts out there, so you want to minimize
lossage.
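
In sketch form, the selection looks something like the following. This
is not the kernel code; the interface tuples, the 40-byte overhead, and
the fallback value are all assumptions chosen for illustration.

```python
def advertised_mss(interfaces, overhead=40, fallback=536):
    """Pick the MSS to advertise from the attached interfaces.

    interfaces: iterable of (name, mtu, is_loopback) tuples
                (a hypothetical representation).
    overhead:   assumed IPv4 + TCP header size without options.
    fallback:   assumed value to use if only loopbacks are configured.
    """
    mtus = [mtu for _name, mtu, is_loopback in interfaces if not is_loopback]
    if not mtus:
        return fallback
    return max(mtus) - overhead

# Example: the large loopback MTU is ignored, the Ethernet MTU wins.
example = [("lo0", 32768, True), ("le0", 1500, False), ("ppp0", 576, False)]
mss = advertised_mss(example)
```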

 > > 3) Assume we have a ppp link in the way which is using an MTU/MRU of 256
 > > (which is legal providing that it will correctly receive a 1500 byte packet)
 > > then the gateway that has the PPP link should send an ICMP host unreachable.
 > > This ICMP will trigger MS to reduce their MSS.

Right, "unreachable - need fragmentation" (not "host unreachable").  This
is supposed to trigger the sender to step down to its next PMTU discovery
step.  However, a router that sends NEEDFRAG is also supposed to tell the
sender the MTU of the next hop, so the sender can drop straight to that
value instead of guessing.
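
A minimal sketch of the sender's side of this, assuming the plateau
table from RFC 1191 as the list of "steps" (the function name is
hypothetical, and this is not a description of any particular stack):

```python
# MTU plateaus from RFC 1191, largest first.
PLATEAUS = (65535, 32000, 17914, 8166, 4352, 2002, 1492, 1006, 508, 296, 68)
MIN_MTU = 68  # smallest MTU an IPv4 link may have

def next_pmtu(current_pmtu, next_hop_mtu=0):
    """New path MTU estimate after receiving a NEEDFRAG.

    next_hop_mtu is the value carried in the ICMP message; routers that
    don't supply one leave it as zero.
    """
    # Trust the router's value when it is sane and actually smaller.
    if MIN_MTU <= next_hop_mtu < current_pmtu:
        return next_hop_mtu
    # Otherwise fall back to the next plateau below the current estimate.
    for plateau in PLATEAUS:
        if plateau < current_pmtu:
            return plateau
    return MIN_MTU
```

For a 1500-byte estimate and a router reporting 576, the sender goes
directly to 576; with no reported value it has to walk down through
1492, 1006, and so on.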

 > > 4) A problem is that quite a few routers will dump the packet but fail to
 > > send the ICMP meaning that the link locks up and eventually fails. MS do
 > > present a strategy that should overcome this.

...as should all TCPs that implement PMTU discovery.  David Borman has
suggested an algorithm called "Black Hole Discovery".  I suggested a
slight (somewhat simpler) variation on it a while back... I'm not sure
yet which algorithm NetBSD's (nearly completed) PMTU discovery code will
implement.

 > Then NetBSD must have this problem.  I use NetBSD 1.2 machines as routers
 > and my win95 system can't talk to the net properly because my PPP link has
 > MTU of 576 bytes (or so, much less than 1500 bytes anyway).  If I change
 > MTU to 1500, everything works ok.

Could you please tcpdump on the NetBSD's ppp interface?  It's important
to know if NetBSD is sending NEEDFRAG to the MS box.  If it's not, that's
definitely a bug.  If it _is_, then the problem is on the MS host.

Quickly looking at ip_input.c:ip_forward(), we do send NEEDFRAG, but
I'd like to know if it's actually happening in your case.

Jason R. Thorpe                                       thorpej@nas.nasa.gov
NASA Ames Research Center                            Home: +1 408 866 1912
NAS: M/S 258-6                                       Work: +1 415 604 0935
Moffett Field, CA 94035                             Pager: +1 415 428 6939