Subject: Re: Stupid ICMP and fragmentation tricks
To: M Graff <explorer@flame.org>
From: None <itojun@iijlab.net>
List: tech-net
Date: 09/21/1999 17:59:39
>It seems people who write firewall rules are idiots these days.  Most
>places recommend blocking "all ICMP" -- which breaks M$'s
>implementation of Path MTU discovery quite nicely.
>Here's the problem.
>I have a shark running NetBSD, which has a GRE tunnel to another
>NetBSD box at home.  The GRE takes some overhead, of course, so
>sometimes packets need to be fragmented.

	Just for clarification.

	remote end ---- netbsd/shark ===== netbsd at home --- your desktop
				      GRE

	1. remote end sends a 1500-byte TCP packet with the DF bit set.
	2. netbsd/shark cannot forward it to gre0, as the MTU of gre0 is 1450.
	3. netbsd/shark sends ICMP "too big" (destination unreachable,
	   fragmentation needed) to remote end.
	4. remote end does not understand, or never sees, the ICMP "too big".
	5. the connection hangs.
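
	The decision netbsd/shark makes in steps 2 and 3 can be sketched
	roughly as below.  This is an illustration only, not the NetBSD
	forwarding code: Packet, forward() and the returned strings are
	made-up names, and the 1450 is the gre0 MTU from step 2.

```python
from dataclasses import dataclass

GRE0_MTU = 1450  # MTU of gre0, taken from step 2 above


@dataclass
class Packet:
    size: int  # total IP packet length in bytes
    df: bool   # True if the Don't Fragment bit is set


def forward(pkt: Packet, mtu: int = GRE0_MTU) -> str:
    """Return what a router does with pkt on a link with the given MTU.

    Hypothetical helper for illustration; the strings are labels only.
    """
    if pkt.size <= mtu:
        return "forward"
    if pkt.df:
        # Too big and DF set: drop it and report back to the sender.
        return "drop, send ICMP too big"
    return "fragment and forward"


print(forward(Packet(size=1500, df=True)))
```

	With the default MTU of 1450 the 1500-byte DF packet is dropped
	with an ICMP error.  If the sender never acts on that error, it
	keeps retransmitting the same full-sized segment, which is the
	hang in step 5.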

	To cope with this situation, the easiest solution is to locally
	change gre0's MTU to 1500 (change the #define in net/if_gre.c) and
	let the GRE packets be fragmented between "netbsd/shark" and "netbsd
	at home".  net/if_gre.c neither copies the inner DF bit to the outer
	header nor sets the outer DF bit itself, so the GRE packets will be
	fragmented.  This costs some performance, since every full-sized
	packet has to be fragmented and reassembled, but you have no other
	choice.

	remote end -> netbsd/shark	IP(DF=1) TCP (size = 1500)
	netbsd/shark -> netbsd at home	IP(DF=0) GRE [IP(DF=1) TCP]
					(size > 1500)
					will be transmitted fragmented
					and then reassembled
	netbsd at home -> your desktop	IP(DF=1) TCP (size = 1500)
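
	The sizes on the tunnel leg can be worked out with a little
	arithmetic.  The sketch below assumes a 20-byte outer IPv4 header
	without options, a 4-byte basic GRE header without optional fields,
	and a 1500-byte MTU on the link between the two NetBSD boxes; none
	of these numbers are stated above, and gre_fragments() is a made-up
	helper name.

```python
from typing import List

IP_HDR = 20   # assumed: outer IPv4 header without options
GRE_HDR = 4   # assumed: basic GRE header without optional fields


def gre_fragments(inner_size: int, link_mtu: int = 1500) -> List[int]:
    """Return the on-wire sizes of the outer IP packet or its fragments."""
    payload = GRE_HDR + inner_size  # bytes carried behind the outer IP header
    if IP_HDR + payload <= link_mtu:
        return [IP_HDR + payload]   # fits, no fragmentation needed
    # Every fragment except the last must carry a multiple of 8 payload bytes.
    per_frag = (link_mtu - IP_HDR) // 8 * 8
    sizes = []
    while payload > 0:
        chunk = min(per_frag, payload)
        sizes.append(IP_HDR + chunk)
        payload -= chunk
    return sizes


print(gre_fragments(1500))
```

	Under these assumptions a 1500-byte inner packet becomes a
	1524-byte outer packet, which goes out as one full-sized fragment
	plus one small one.  That second fragment for every full-sized
	packet is where the performance cost comes from.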

>So, what would break if I changed the fragmentation semantics to be
>something like:
>	if (tcp && dont_fragment_set && must fragment) {
>		send ICMP packet
>		fragment and send to host anyway
>	} else {
>		normal behavior here
>	}

	The DF bit is the DF bit.  Do not fragment a packet with the DF bit
	set.
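
	For reference, the DF bit is the 0x4000 bit of the 16-bit
	flags/fragment-offset field at byte offset 6 of the IPv4 header.
	A minimal sketch of testing it on a raw header follows; df_set()
	is a made-up name, not a kernel function.

```python
import struct

IP_DF = 0x4000  # Don't Fragment flag in the flags/fragment-offset field


def df_set(header: bytes) -> bool:
    """Return True if the DF bit is set in a raw IPv4 header."""
    # Bytes 6-7 hold 3 flag bits followed by the 13-bit fragment offset.
    (flags_frag,) = struct.unpack_from("!H", header, 6)
    return bool(flags_frag & IP_DF)


# Build a dummy 20-byte header with only the DF bit set.
hdr = bytes(6) + struct.pack("!H", IP_DF) + bytes(12)
print(df_set(hdr))
```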

itojun