Subject: Re: ICMP attacks against TCP
To: Miles Nordin <carton@Ivy.NET>
From: Fernando Gont <fernando@gont.com.ar>
List: tech-net
Date: 12/14/2004 02:46:41
At 03:40 10/12/2004 -0500, Miles Nordin wrote:

>    fg> Hope to get your constructive comments, now. :-)
>
>in 7.2.2 you suggest ignoring PMTUD messages until you get a bunch of
>them.  This seems clumsy, and could really slow down PMTU convergence.

Maybe the draft was not clear enough. While the draft states that the 
proposed fix for PMTUD is analogous to that of delaying the connection 
reset, it doesn't recommend any values for MAXSEGREXMIT and MAXPKTTOOBIG. 
That is, you could, for example, set MAXSEGREXMIT to 2 and MAXPKTTOOBIG to 1.

That is, the packet should be transmitted at least twice, and we should get 
at least one ICMP DF.

You can play with these two constants. Setting MAXSEGREXMIT to a higher 
value than that of MAXPKTTOOBIG means, in a way, that you acknowledge that 
packets may be lost, or that routers may be rate-limiting their ICMP traffic.
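
Just so we're talking about the same thing, here's a rough sketch in C of 
the counter-based check I have in mind (the structure and function names are 
made up; this is not code from the draft):

#include <stdbool.h>
#include <stdint.h>

#define MAXSEGREXMIT    2   /* example values only; the draft leaves them open */
#define MAXPKTTOOBIG    1

struct tcp_conn {               /* hypothetical per-connection state */
    unsigned seg_rexmit;        /* retransmissions of the outstanding segment */
    unsigned pkt_toobig;        /* matching ICMP "frag needed" messages seen */
    uint32_t pmtu;              /* current Path-MTU estimate */
};

/* Called when an ICMP "fragmentation needed and DF set" message arrives. */
bool
maybe_update_pmtu(struct tcp_conn *tc, uint32_t nexthop_mtu)
{
    tc->pkt_toobig++;

    /* Give the oversized segment a fair chance to be acknowledged first. */
    if (tc->seg_rexmit < MAXSEGREXMIT || tc->pkt_toobig < MAXPKTTOOBIG)
        return false;           /* keep the current estimate, for now */

    tc->pmtu = nexthop_mtu;     /* now honor the advertised Next-Hop MTU */
    tc->seg_rexmit = 0;
    tc->pkt_toobig = 0;
    return true;
}

With MAXSEGREXMIT = 2 and MAXPKTTOOBIG = 1, this is exactly the "transmitted 
at least twice, at least one ICMP DF" behavior described above.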

The rationale behind this fix is to wait for at least 1*RTO, so that we can 
be "assured" the packet had enough time to be acked.

Note that the first RTO shouldn't differ much from the RTT.

You could set MAXSEGREXMIT to 1 and MAXPKTTOOBIG to 1, and the delay in the 
update would be in the order of the RTT.

Also, as stated in the draft, these constants could be a function of the 
advertised Next-Hop MTU.

For example, when the advertised Next-Hop MTU is < 500, we could set 
MAXSEGREXMIT to 2 and MAXPKTTOOBIG to 1 (or two).

When the advertised Next-Hop MTU is larger than that, we could use a 
MAXSEGREXMIT of 1, and a MAXPKTTOOBIG of 1.

That is, when the advertised Next-Hop MTU is something reasonable (>500), 
we'd just wait for 1*RTO, which, in most cases, will be smaller than a 
couple of seconds.

When the advertised Next-Hop MTU is smaller than that, we should be more 
cautious, and thus would wait a bit more. Perhaps 2*RTO or so.

For advertised Next-Hop MTUs larger than, say, 1000 octets, we could set 
MAXSEGREXMIT to 0 and MAXPKTTOOBIG to 1. That is, you'd update the 
estimated MTU as soon as you get the ICMP "fragmentation needed and DF bit 
set" error message.


>With gre-in-IPsec tunnels already it can take >10 round trips to
>discover PMTU.

Could you provide more information on why, in this specific case, it might 
take more than 10 RTTs?


>Maybe you should instead suggest keeping state for ``maximum MTU
>pushed through the link so far'' value that starts at 0 for fresh
>connections, and tracks the largest segment ever acknowleged.

Maybe I'm missing something, but for PMTUD the estimate starts at the MTU of 
the outgoing interface.


>This
>metric along with the existing PMTU measurement would capture the
>MTU-uncertainty of a fresh connection.  For a connection that sends
>short spurts of data and never has inclination to transmit a large
>segment, these two values might never converge.  But a bulk file
>transfer should quickly bring the two values to be equal.  You can
>suggest your wait-for-multiple-retransmits workaround only when the
>ICMP toobig message asks for an MTU smaller than the ``maximum MTU
>ACKed so far''.

I'm not sure we would benefit from this behavior in practice. I mean, for 
bulk transfers, you'll always be sending full-sized segments. In those 
cases, even the first ICMP DF would be claiming a Next-Hop MTU larger than 
the "maximum MTU ACKed so far", and thus the resulting behavior would be 
that described in the draft.

The only case in which you'd benefit from this "refinement" is, as you 
mentioned, the one in which the system usually sends small chunks, but at 
some point sends a full-sized segment. I guess this scenario might take 
place in a telnet session. However, even then your refinement only means 
we'd discover the *initial* PMTU a bit quicker. I cannot think of any other 
point in the connection at which your "refinement" would kick in.

However, I'll think about this a bit more, and of course we'll be glad to 
include this in the draft if there are clear benefits of this "refined" 
behavior.


>When your algorithm sees enough unsuccessful
>retransmissions, it can knock the ``maximum MTU ACKed so far'' back to
>0 and allow PMTUD to restart.

Well, when you have seen enough unsuccessful retransmissions (where "enough" 
means anything greater than or equal to MAXSEGREXMIT), you can honor the 
ICMP DF.
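
If I'm reading your suggestion correctly, it would amount to something like 
this (again just a sketch with invented names, building on the counters 
sketched earlier):

#include <stdint.h>

#define MAXSEGREXMIT    2       /* example value, as before */

struct tcp_conn {
    uint32_t pmtu;              /* current Path-MTU estimate */
    uint32_t max_mtu_acked;     /* largest segment ever ACKed; starts at 0 */
    unsigned seg_rexmit;        /* retransmissions of the current segment */
};

/* Track the largest segment size ever acknowledged on this connection. */
void
segment_acked(struct tcp_conn *tc, uint32_t seg_size)
{
    if (seg_size > tc->max_mtu_acked)
        tc->max_mtu_acked = seg_size;
}

/* ICMP "fragmentation needed" handling under your refinement. */
void
icmp_toobig(struct tcp_conn *tc, uint32_t nexthop_mtu)
{
    if (nexthop_mtu >= tc->max_mtu_acked) {
        /* We have never pushed a segment this large through the path,
         * so the message is plausible: honor it right away. */
        tc->pmtu = nexthop_mtu;
    } else {
        /* The claim contradicts what we have already seen ACKed: apply
         * the wait-for-retransmissions check sketched earlier instead. */
    }
}

/* After enough unsuccessful retransmissions, assume the path may have
 * changed: knock max_mtu_acked back to 0 so that PMTUD can restart. */
void
retransmission_timeout(struct tcp_conn *tc)
{
    if (++tc->seg_rexmit >= MAXSEGREXMIT)
        tc->max_mtu_acked = 0;
}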


>I've long found it annoying when hosts will reset an ESTABLISHED
>connection upon getting an ICMP unreachable.  I appreciate your
>revisiting this nuissance under the ``security'' banner

Well, the banner depends on whether this happens intentionally, or just "by 
chance".
My draft addresses connection-resets that are elicited *intentionally*.

Actually, the draft was born as follows. I'm an active member of the IETF's 
TCPM WG. You know, earlier this year there was a lot of discussion about 
that TCP reset vulnerability presented at CanSecWest. Well, I was working on 
another ICMP thing, and then this idea came up: "hey, guys... there's a lot 
of discussion about resetting TCP connections by means of spoofed TCP 
segments. But... have you thought about ICMP?". There was already that 
proposal for using MD5 signatures for protecting BGP sessions. However, 
neither that spec nor the TCP specs said a word about performing security 
checks on ICMP packets.

Several parties ran some tests on different systems, and it seems people had 
forgotten about ICMP attacks. Many systems (most, I'd say) do not perform 
checks on the TCP sequence number contained in the payload of the ICMP error 
message. That makes those systems vulnerable to the ICMP Source Quench and 
the PMTUD attacks.
Some systems not only do not perform TCP sequence number checks, but also 
honor ICMP unreachables by aborting the corresponding connection.
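
For what it's worth, the check I mean looks roughly like this (a sketch; it 
assumes the ICMP payload carries the offending IP header plus at least the 
first 8 bytes of the TCP header, as required by RFC 792):

#include <stdbool.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>          /* ntohl() */

struct tcp_conn {               /* simplified view of the state the check needs */
    uint32_t snd_una;           /* oldest unacknowledged sequence number */
    uint32_t snd_nxt;           /* next sequence number to be sent */
};

/* Sequence-space comparison: is a <= b < c (mod 2^32)? */
static bool
seq_between(uint32_t a, uint32_t b, uint32_t c)
{
    return (uint32_t)(b - a) < (uint32_t)(c - a);
}

/*
 * Validate the TCP sequence number embedded in the payload of an ICMP
 * error message: it must correspond to data we actually have in flight,
 * i.e. SND.UNA <= SEG.SEQ < SND.NXT.  If it doesn't, the message cannot
 * refer to a segment we sent, and should be ignored.
 */
bool
icmp_payload_seq_ok(const struct tcp_conn *tc, const uint8_t *embedded_tcp_hdr)
{
    uint32_t seq;

    memcpy(&seq, embedded_tcp_hdr + 4, sizeof(seq));    /* offset 4: SEG.SEQ */
    return seq_between(tc->snd_una, ntohl(seq), tc->snd_nxt);
}

A stack that skips this check will accept an ICMP error quoting any sequence 
number whatsoever, which is what makes these blind attacks so easy.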

I have not been keeping track of the vulnerable systems, but it seems 
Windows systems are vulnerable to the blind connection-reset and the ICMP 
Source Quench attacks (at least). A Cisco IP phone was found vulnerable to 
these two attacks, too. This was documented on Cisco's web site, but for 
some reason they have now removed this information. OpenBSD honored ICMP 
Source Quench messages, but, IIRC, they now ignore them. Linux changed 
this, too; it now ignores ICMP Source Quench messages. I have been told 
FreeBSD will now ignore ICMP Source Quench messages, too.


>            o    Destination Unreachable -- codes 2-4
>                  (proto, port unreachable)
>
>                  These are hard error conditions, so TCP SHOULD abort
>                  the connection.
>
>I see your RFC suggests the second case, the hard errors, revert to
>soft for established connections, with some reasonable defense.

Yeah. Also, for connections in the established state, as discussed in the 
draft, you should not be getting hard errors.
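
Concretely, what I have in mind for established connections is something 
like this (sketch only; the state and field names are made up):

enum tcp_state { TCPS_SYN_SENT, TCPS_ESTABLISHED /* ... */ };

struct tcp_conn {
    enum tcp_state state;
    int soft_error;             /* last soft error seen; reported on timeout */
};

static void
tcp_abort(struct tcp_conn *tc)
{
    /* Placeholder: a real stack would send a RST and tear down the PCB. */
    (void)tc;
}

/*
 * ICMP Destination Unreachable, codes 2-4.  For a connection that has
 * already reached the ESTABLISHED state, record the error instead of
 * aborting; if the connection later times out anyway, the error can
 * still be reported to the application.
 */
void
icmp_hard_unreach(struct tcp_conn *tc, int icmp_code)
{
    if (tc->state == TCPS_ESTABLISHED) {
        tc->soft_error = icmp_code;     /* treat as a soft error */
        return;
    }
    /* For connections still being established, the RFC 1122 hard-error
     * semantics could be kept, i.e. abort the connection. */
    tcp_abort(tc);
}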


>Sounds good.  But on Windows ExPee, if I unplug the network cable my
>Putty session is instantly killed.  Isn't that a violation of 4.2.3.9?

Well, I'm not sure an ICMP error message is even generated in that case. But 
Windows certainly should not be aborting the connection.


>Have you done any testing if the ``soft'' unreachables can also be
>wrongly used to reset established connections on certain TCP stacks?

No. But now that you point this out, I'll check this. (My draft was, in 
principle, addressing those issues on which the specs could be better; 
i.e., it addresses issues with the specs themselves.)

BTW, thanks so much for your feedback!


--
Fernando Gont
e-mail: fernando@gont.com.ar || fgont@acm.org