Subject: Re: SOLVED! The cause of puzzling TCP (eg. WHOIS) connection failures
To: None <tech-net@netbsd.org>
From: Greg A. Woods <woods@most.weird.com>
List: tech-net
Date: 11/22/1998 15:05:04
[ On Sun, November 22, 1998 at 11:09:57 (-0800), Marc Slemko wrote: ]
> Subject: Re: SOLVED! The cause of puzzling TCP (eg. WHOIS) connection failures  with some InterNIC.net hosts 
>
> - saying that PMTU-D is broken by design because you can set up filters
> that result in TCP connections not working is silly.  You can also set up
> filters to deny all SYN packets that prevent TCP connections from working;
> does that mean TCP connection establishment is broken by design?  It is
> arguable that it should have been a TCP option and there was debate about
> that, but that is done.

Obviously I don't agree!  ;-)

If an optional IP extension protocol causes TCP to become un-robust for
any reason whatsoever then the protocol is "broken by design".

However I don't want to throw stones directly at the protocol designers
(and those that studied and approved it).  In 1990 they probably didn't
expect anally retentive pedant firewall admins to filter ICMP "needs
frag" packets.  They were probably also counting on TCP retransmissions
to trigger fresh ICMP messages, and thus to avoid problems that might
be due to minor levels of packet loss.

> - the place to deal with broken firewalls is at the endpoints, not the
> routers.  Some systems already have implemented PMTU-D blackhole
> discovery.  This has to be done on the machine trying to perform the
> PMTU-D.

It seems that it may also be possible to work around failing PMTU-D to a
certain extent on the receiving end too, at least for the case where it
can be assumed the application will re-try the connection at some
reasonably close point in the future.  Hmmm...  maybe not:

14:52:41.074524 204.92.254.2.2897 > 198.41.0.6.43: S 256283101:256283101(0) win 16384 <mss 1460,nop,wscale 0,nop,nop,timestamp 293047 0> (ttl 64, id 56284)
14:52:41.381707 198.41.0.6.43 > 204.92.254.2.2897: S 1824508650:1824508650(0) ack 256283102 win 8760 <mss 1460> (DF) (ttl 238, id 28781)
14:52:41.382190 204.92.254.2.2897 > 198.41.0.6.43: . ack 1 win 17520 (ttl 64, id 56287)
14:52:41.385389 204.92.254.2.2897 > 198.41.0.6.43: P 1:6(5) ack 1 win 17520 (ttl 64, id 56288)
14:52:41.735992 198.41.0.6.43 > 204.92.254.2.2897: . ack 6 win 8760 (DF) (ttl 238, id 28782)
14:52:42.564456 198.41.0.6.43 > 204.92.254.2.2897: P 1:4(3) ack 6 win 8760 (DF) (ttl 238, id 28783)
14:52:42.564849 204.92.254.2.2897 > 198.41.0.6.43: . ack 4 win 17517 (ttl 64, id 56309)
14:55:00.049618 198.41.0.6.43 > 204.92.254.2.2897: R 4:4(0) ack 6 win 0 (ttl 48, id 2812)

The above is a trace from my end of a failing connection.  Unfortunately
it doesn't really appear to have failed -- just stalled (though there is
an odd-size packet and a resend).  This is pretty typical too....

I think I've also shown to some degree that it can also be dealt with in
the router that's sending the ICMP "needs frag" packets, if only by
following MCR's suggestion.  If the receiving end can't do black-hole
detection then indeed routers *should* try to do something about it (at
least until black-hole detection is a mandatory part of PMTU-D on the
hosts that use it).

I *do* want to learn more about PMTU-D "black hole discovery", such as
how it relates to my own ideas and how it is implemented and how I can
support it.
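
As a rough illustration of what sender-side black-hole detection could
look like, here is a minimal sketch in Python.  It is not taken from any
real stack: the class, the method names and the retry threshold are all
hypothetical.  The only fixed number is the 536-byte fallback, which is
the classic default MSS (a 576-byte datagram less 40 bytes of IP and TCP
headers).  The idea is simply that if full-size segments sent with DF
set keep timing out, the sender stops insisting on DF and drops to a
conservative segment size.

```python
# Hypothetical sketch of sender-side PMTU-D black-hole detection.
# Names and thresholds are illustrative assumptions, not a real API.

FALLBACK_MSS = 536     # 576-byte default datagram minus 40 bytes of headers
MAX_DF_RETRIES = 2     # assumed threshold; a real stack would pick its own


class PmtuState:
    """Per-connection state for the sketch."""

    def __init__(self, mss):
        self.mss = mss
        self.df = True       # PMTU-D active: packets go out with DF set
        self.timeouts = 0    # consecutive timeouts of full-size segments

    def on_ack(self):
        # Forward progress: whatever we sent is getting through.
        self.timeouts = 0

    def on_retransmit_timeout(self, segment_len):
        # Only full-size DF segments are evidence of a black hole;
        # small segments fit through any sane path regardless.
        if not self.df or segment_len < self.mss:
            return
        self.timeouts += 1
        if self.timeouts > MAX_DF_RETRIES:
            # Assume the ICMP "needs frag" replies are being filtered:
            # stop setting DF and fall back to a safe segment size.
            self.df = False
            self.mss = min(self.mss, FALLBACK_MSS)
```

With these assumed values a connection that starts at an MSS of 1460
gives up on DF after the third consecutive timeout of a full-size
segment, and an intervening ACK resets the count.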

> If you have the first case, there is no real reason to block the ICMP
> can't fragments except to avoid revealing information about your internal
> network topology if you filter things so traceroute, etc. don't work.

Assuming your internal topology uses an MTU that's greater than, or
equal to, the MTU on your gateway throughout, I don't see how ICMP "needs
frag" packets will ever be generated, never mind how they'll reveal any
internal topology.  Granted this won't be a valid assumption for some
networks that are really internets themselves, but 

> In the second case (which is the one that I see far more often, since many
> people just block all ICMP to web servers), there is a valid reason to
> block the can't fragments.  That reason is to avoid a DoS attack based on
> forging can't fragment packets telling A that the MTU is very very low
> (especially if the implementation doesn't place a reasonable minimum on the
> size it will drop down to).  If you want to filter ICMP can't fragments
> for that reason, no problem.  Do it.  But then you need to disable PMTU-D
> on your machines that are behind the filter.

You'd think people smart enough to understand that they need to block
ICMP "needs frag" packets from reaching any of their high-risk servers
would be smart enough to realize that they need to prevent those servers
from requesting such packets.  (And of course if PMTU-D is totally
disabled on such servers there should be no need to block packets
destined for them in the first place -- assuming the implementations are
secure and in fact completely ignore such packets when they don't need
them!  ;-)
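
To make the two defences concrete, here is a small hedged sketch in
Python of how a host might treat an incoming "needs frag" message.  The
function name and the 296-byte floor are assumptions chosen for
illustration only (RFC 1191 itself merely forbids going below 68
octets); no real implementation is being quoted.  It shows a host with
PMTU-D disabled ignoring the message entirely, and a host with PMTU-D
enabled refusing both to raise its estimate and to drop below a
configured minimum.

```python
# Hypothetical handling of an ICMP "needs frag" message on a host.
# MIN_PMTU is an assumed policy value, not a figure from any standard.

MIN_PMTU = 296


def apply_needs_frag(current_mtu, advertised_mtu,
                     pmtud_enabled=True, floor=MIN_PMTU):
    """Return the path MTU estimate to use after a "needs frag" message."""
    if not pmtud_enabled:
        # The host never asked for these messages, so it ignores them.
        return current_mtu
    if advertised_mtu >= current_mtu:
        # A "needs frag" message must never raise the estimate.
        return current_mtu
    # Accept the reduction, but never go below the floor (or below the
    # current estimate, should that already be under the floor).
    return max(advertised_mtu, min(floor, current_mtu))
```

For example, with a current estimate of 1500 a forged message
advertising 68 only brings the estimate down to the assumed floor of
296, while a plausible one advertising 1400 is accepted as is.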

Maybe there's a rash of bad/wrong advice going around in the web-server
admin world that needs correcting.

-- 
							Greg A. Woods

+1 416 218-0098      VE3TCP      <gwoods@acm.org>      <robohack!woods>
Planix, Inc. <woods@planix.com>; Secrets of the Weird <woods@weird.com>