Subject: DF bit copying in gif processing
To: None <thorpej@netbsd.org>
From: Jun-ichiro itojun Hagino <itojun@iijlab.net>
List: tech-net
Date: 09/15/2000 01:12:25
	jason, what is the reason for your change to the gif code, namely
	sys/netinet/in_gif.c, "copy DF bit from inner to outer"?
	(revision 1.14 -> 1.15)
	i think we should make it configurable.  if we are to automate the
	configuration, we must check net.inet.ip.mtudisc at least.

	a meta problem here is that there are too many tunnelling specifications
	out there, and they all state different things (this is not the place
	to discuss, but anyway...)  see the comment in sys/netinet/ip_encap.c
	for a (probably) full list of tunnelling RFCs which use ip proto #4/#41.

	details below.

itojun



independent of the specifications, we have several choices of behavior:
- set-to-0
- set-to-1
- copy-from-inner
- more complex rule
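
as a rough sketch, the first three choices could be written as below.
the enum names and both functions are hypothetical (they are not taken
from in_gif.c); only IP_DF (0x4000, as in netinet/ip.h) is an existing
name.  outer_df_auto() shows one possible way to automate the choice
from net.inet.ip.mtudisc, under the assumption that we fall back to
set-to-0 when path MTU discovery is disabled.

```c
#include <stdint.h>

#define IP_DF 0x4000	/* "don't fragment" flag in ip_off (host byte order) */

/* hypothetical policy names, for illustration only */
enum df_policy { DF_SET_TO_0, DF_SET_TO_1, DF_COPY_FROM_INNER };

/* return the DF flag to put into the outer header's ip_off */
static uint16_t
outer_df(enum df_policy policy, uint16_t inner_off)
{
	switch (policy) {
	case DF_SET_TO_1:
		return IP_DF;
	case DF_COPY_FROM_INNER:
		return inner_off & IP_DF;
	case DF_SET_TO_0:
	default:
		return 0;
	}
}

/*
 * assumed automation rule: copy the inner DF bit only when the ingress
 * node itself performs path MTU discovery (mtudisc != 0).
 */
static uint16_t
outer_df_auto(int mtudisc, uint16_t inner_off)
{
	return outer_df(mtudisc ? DF_COPY_FROM_INNER : DF_SET_TO_0, inner_off);
}
```

the caller would OR the result into the outer ip_off before converting
it to network byte order.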

after an encapsulation, if outer DF bit is 0 (first and last cases),
the outer packet may be fragmented between ingress node and egress node.
depending on the packet loss rate there, the tunnelled packet may vanish,
as losing a single fragment loses the whole packet.  it is also claimed
in many RFCs that multi-level fragmentation can degrade performance.

if outer DF bit is 1 (latter two cases), the tunnel does not play nice
with the net.inet.ip.mtudisc=0 case, as the ingress node will not be able
to learn the path MTU of the tunnel path.  in more detail,
under the following assumptions:
- topology is like this
	source ---- ingress ==== egress ---- destination
- source turns on path MTU discovery, and throws packets with DF=1 to
  destination
- ingress node disables path MTU discovery
- given:
	hlen = IP header length (20 for IPv4, 40 for IPv6)
	len = inner (encapsulated) packet length
	pmtu(x, y) = path MTU between x and y (real pmtu)
  and the following conditions hold
	len < pmtu(source, ingress)
	pmtu(ingress, egress) < len + hlen
	pmtu(source, ingress) > pmtu(ingress, egress)
	pmtu(ingress, egress) < interface MTU of ingress
ingress node has no chance to learn about pmtu(ingress, egress),
and packet will silently vanish between ingress node and egress node,
and source has no chance to be informed about it.
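
the conditions above can be checked mechanically.  the following is
only a sketch: the function name and parameter names are made up for
illustration, and the comparisons are copied as written above.

```c
#include <stdbool.h>

/*
 * return true if a DF=1 tunnelled packet would silently vanish between
 * ingress and egress under the assumptions listed above.
 *   len           inner (encapsulated) packet length
 *   hlen          outer IP header length (20 for IPv4, 40 for IPv6)
 *   pmtu_src_in   pmtu(source, ingress)
 *   pmtu_in_eg    pmtu(ingress, egress)
 *   in_ifmtu      interface MTU of ingress
 */
static bool
tunnel_blackhole(int len, int hlen, int pmtu_src_in, int pmtu_in_eg,
    int in_ifmtu)
{
	return len < pmtu_src_in &&
	    pmtu_in_eg < len + hlen &&
	    pmtu_src_in > pmtu_in_eg &&
	    pmtu_in_eg < in_ifmtu;
}
```

for example, with len = 1400, hlen = 20, pmtu(source, ingress) = 1500,
pmtu(ingress, egress) = 1400 and an ingress interface MTU of 1500, all
four conditions hold: the 1420-byte outer packet leaves the ingress
interface without trouble and is dropped somewhere on the tunnel path.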


RFC2401 is rather clear about the following (6.1 and appendix B):
- copy-from-inner is optional behavior, and
- an implementation should provide a way to configure DF bit
  processing, like set-to-0, set-to-1, or copy-from-inner.
the RFC looks really confused about the path MTU discovery issue;
"propagating path MTU discovery" (6.1.2.1) is very cryptic.
the RFC does not mandate path MTU discovery on the ingress node.

RFC2003 says the following:
- set-to-1 if inner DF bit is 1
- set-to-0 or set-to-1 if inner DF bit is 0
and the document mandates ingress node to perform path MTU discovery.
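
the RFC2003 rule as summarized above could be sketched like this.
the function name and the set_when_clear knob are hypothetical; the
knob stands for the implementation's free choice when inner DF bit is 0.

```c
#include <stdint.h>

#define IP_DF 0x4000	/* "don't fragment" flag in ip_off (host byte order) */

/*
 * inner DF = 1: outer DF must be 1.
 * inner DF = 0: outer DF may be 0 or 1, chosen by set_when_clear.
 */
static uint16_t
rfc2003_outer_df(uint16_t inner_off, int set_when_clear)
{
	if (inner_off & IP_DF)
		return IP_DF;
	return set_when_clear ? IP_DF : 0;
}
```

with set_when_clear = 0 this behaves the same as copy-from-inner.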

RFC1853 recommends copy-from-inner.
the document mandates ingress node to perform path MTU discovery.

RFC1933 defines a complex rule for DF bit processing (due to the
difference between the minimum IPv4 fragment reassembly buffer and the
IPv6 minimum link MTU).  the rule seems to assume that the ingress node
can perform path MTU discovery (RFC1933 is not clear about it).

draft-ietf-ngtrans-6to4-07.txt recommends set-to-0.  the document
is silent about path MTU discovery on ingress node.