Subject: Re: SOLVED! The cause of puzzling TCP (eg. WHOIS) connection failures with some InterNIC.net hosts
To: NetBSD Networking Technical Discussion List <tech-net@netbsd.org>
From: Greg A. Woods <woods@most.weird.com>
List: tech-net
Date: 11/22/1998 13:26:55
[ On Sun, November 22, 1998 at 00:56:34 (-0500), Michael C. Richardson wrote: ]
> Subject: Re: SOLVED! The cause of puzzling TCP (eg. WHOIS) connection failures with some InterNIC.net hosts 
>
>   You need to expand this a bit.
>   ICMP is a part of IP. It doesn't stand on its own. If you don't handle
> ICMP, then you haven't implemented IP. 

Yes, of course, but what I think I've done is show that PMTUD is
*not* fine even in the face of proper and complete TCP/IP/ICMP
implementations at the end-points.  The users affected by failing PMTUD
are rarely in control of the nodes with "incomplete" implementations or
configurations, and one end of a connection broken by failing PMTUD
can't even see why it's failing (as I'm all too well aware, though now I
can guess that a connection that had DF bits set in its packets, and
which suddenly fails, is likely broken due to failing PMTUD).

>   This is known as black hole detection. 

Cool.  Where's that discussed and described (other than in this thread!)?
(it sounds familiar, but I've not yet heard it mentioned in any of the
reading I've done recently on this issue)

Does NetBSD implement it already?  How many host implementations don't?
How hard is it to implement?  (My guess is "not too hard" if PMTUD
already has hooks into TCP window size negotiation.)
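
For illustration only, here is a rough sketch of the kind of heuristic
that "black hole detection" usually refers to: if full-sized segments
sent with DF set keep timing out, suspect that the ICMP "fragmentation
needed" messages are being lost, then drop to a conservative MSS and
stop setting DF.  This is not NetBSD's code; the names ConnState and
on_retransmit_timeout, the threshold of 4, and the fallback MSS of 536
are all assumptions made up for the example.

```python
from dataclasses import dataclass

# Hypothetical tuning values; a real stack would choose its own.
BLACKHOLE_RETRANSMIT_THRESHOLD = 4  # consecutive timeouts before suspecting a black hole
FALLBACK_MSS = 536                  # 576-byte datagram minus 40 bytes of IP + TCP headers


@dataclass
class ConnState:
    """Minimal per-connection state for this sketch (not a real TCP control block)."""
    mss: int               # current send MSS in bytes
    df: bool = True        # whether outgoing segments carry the Don't Fragment bit
    retransmits: int = 0   # consecutive retransmission timeouts


def on_retransmit_timeout(conn: ConnState, segment_len: int) -> bool:
    """Call on each retransmission timeout.

    Returns True if a PMTUD black hole was assumed and the connection
    fell back to a small MSS with DF cleared.
    """
    conn.retransmits += 1
    full_sized = segment_len >= conn.mss
    if conn.df and full_sized and conn.retransmits >= BLACKHOLE_RETRANSMIT_THRESHOLD:
        # Large DF packets are vanishing and no ICMP error is arriving:
        # stop relying on PMTUD for this connection.
        conn.mss = min(conn.mss, FALLBACK_MSS)
        conn.df = False
        conn.retransmits = 0
        return True
    return False


def on_ack(conn: ConnState) -> None:
    """Forward progress resets the timeout counter."""
    conn.retransmits = 0
```

Small segments that time out do not trigger the fallback in this
sketch, since they would not have needed fragmenting in the first place.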

>   Except that the PMTU work happens at the edges, and not on the router.

Huh?  PMTUD happens from one edge through to some node in the middle.
The receiving edge has nothing to say or do in the matter (unless
failing TCP connections can be used to guess at the cause, and then to
use a smaller MSS for connections from that source address in the
future, hoping the application will retry the failed connection).
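
The receiver-side workaround in the parenthetical above could look
roughly like this: remember peers whose connections have stalled, and
advertise a smaller MSS to them on later connections so that their
segments fit without fragmentation.  The class PeerMssCache, its method
names, and the values 1460 and 536 are hypothetical and only meant to
illustrate the idea.

```python
# Hypothetical values: 1460 is a typical Ethernet MSS, 536 the conservative default.
DEFAULT_MSS = 1460
CLAMPED_MSS = 536


class PeerMssCache:
    """Remembers source addresses whose connections appear to hit a PMTUD black hole."""

    def __init__(self) -> None:
        self._clamped: dict[str, int] = {}

    def note_stalled_connection(self, peer_addr: str) -> None:
        # This is only a guess at the cause: the receiver cannot see
        # why the peer's large segments never arrived.
        self._clamped[peer_addr] = CLAMPED_MSS

    def mss_to_advertise(self, peer_addr: str) -> int:
        # The MSS option we send in our SYN or SYN-ACK limits the size of
        # the segments the peer sends to us on that connection.
        return self._clamped.get(peer_addr, DEFAULT_MSS)
```

This only helps if the application retries, since the connection that
stalled keeps the MSS it negotiated at setup.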

>   My point is that it may require a document that explains to them why
> they are non-compliant.

Yes, that's probably true, though I've been hoping that the problem is
self-evident and already documented in existing standards (albeit not as
explicitly as it could be).

-- 
							Greg A. Woods

+1 416 218-0098      VE3TCP      <gwoods@acm.org>      <robohack!woods>
Planix, Inc. <woods@planix.com>; Secrets of the Weird <woods@weird.com>