Subject: Re: Default value of net.inet.ipsec.dfbit breaks PMTU over IPsec tunnels
To: Jason Thorpe <thorpej@wasabisystems.com>
From: Greg Troxel <gdt@ir.bbn.com>
List: tech-net
Date: 05/28/2004 08:17:08
I was surprised to find that I had to turn net.inet.ipsec.dfbit
on manually.  I was having trouble across an IPsec VPN, and setting
dfbit to 1 caused PMTU-D to work.

The current 2401bis draft (02) simply says that whether the DF bit in
tunnel headers is copied, cleared, or set should be configurable.
(2401bis has a note indicating that dfbit processing should perhaps be
per SPD entry, but our SPD implementation is already less powerful
than 2401 requires, and this seems minor - and a PF_KEYv2 interface
change issue.)  Cisco seems to default to copy, but it's hard to tell
from their web documentation.

While I can see the case for letting the tunnel reassemble the ESP
packet, this is arguably not respecting the intent of the original DF
bit.  So I think we should copy the DF bit by default, and it seems
this just requires setting the dfbit sysctl value to 2 at
initialization time.
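
To make the three settings concrete, here is a minimal standalone
sketch of what each dfbit value means for the outer (tunnel) header.
It is not the code in ipsec.c; the enum and the function name
outer_ip_off are made up for illustration, and the field is treated
as being in host byte order.

```c
#include <stdint.h>

#define IP_DF 0x4000	/* "don't fragment" flag in the IPv4 flags/offset field */

/* Hypothetical names for the three sysctl values (0, 1, 2). */
enum dfbit_mode { DFBIT_CLEAR = 0, DFBIT_SET = 1, DFBIT_COPY = 2 };

/*
 * Return the flags/offset field for the outer header, given the inner
 * header's field and the configured mode.  Only DF is ever carried
 * over; the inner MF flag and fragment offset are not.
 */
static uint16_t
outer_ip_off(uint16_t inner_ip_off, int mode)
{
	switch (mode) {
	case DFBIT_SET:
		return IP_DF;			/* always set DF */
	case DFBIT_COPY:
		return inner_ip_off & IP_DF;	/* honor the sender's DF */
	case DFBIT_CLEAR:
	default:
		return 0;			/* always allow fragmentation */
	}
}
```

With mode 2, an inner packet sent with DF set yields an outer packet
with DF set, so an undersized hop answers with ICMP "fragmentation
needed" instead of silently fragmenting the ESP packet.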

People who have problems with filtered ICMP can either set dfbit back
to 0 and take the fragmentation performance hit, or rely on PMTU
blackhole detection on the client side, which should probably find
the IPsec-reduced MTU.
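
For a rough idea of how much the tunnel reduces the MTU, here is a
back-of-the-envelope sketch for ESP in tunnel mode over IPv4.  The
function name and its parameters are hypothetical.  It assumes an
outer IPv4 header without options, no NAT-T/UDP encapsulation, and
padding to the cipher block size only.

```c
#include <stddef.h>

/*
 * Largest inner packet that fits in one unfragmented ESP tunnel packet.
 * iv_len, blk_len and icv_len depend on the negotiated transforms and
 * must be supplied by the caller.  Returns 0 if nothing fits.
 */
static size_t
esp_tunnel_inner_mtu(size_t path_mtu, size_t iv_len, size_t blk_len,
    size_t icv_len)
{
	const size_t outer_ip = 20;	/* IPv4 header, no options */
	const size_t esp_hdr = 8;	/* SPI + sequence number */
	const size_t esp_trailer = 2;	/* pad length + next header */
	size_t fixed = outer_ip + esp_hdr + iv_len + icv_len;
	size_t room;

	if (blk_len == 0 || path_mtu < fixed + blk_len)
		return 0;

	room = path_mtu - fixed;	/* space left for the encrypted part */
	room -= room % blk_len;		/* encrypted part is whole blocks */
	return room > esp_trailer ? room - esp_trailer : 0;
}
```

For example, with a 1500-byte path, an 8-byte IV, 8-byte blocks and a
12-byte ICV, this gives 1446; with a 16-byte IV and 16-byte blocks it
gives 1438.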

It can be argued that mtudisc defaulting to on and dfbit set to 2
should be linked.  My 1.6.2ish systems don't have mtudisc on by
default, but my currentish ones do.

Index: sys/netipsec/ipsec.c
--- ipsec.c.~1.1.1.2.~  2004-01-27 20:35:31.000000000 -0500
+++ ipsec.c     2004-05-28 08:15:38.000000000 -0400
@@ -110,7 +110,7 @@
 /* NB: name changed so netstat doesn't use it */
 struct newipsecstat newipsecstat;
 int ip4_ah_offsetmask = 0;     /* maybe IP_DF? */
-int ip4_ipsec_dfbit = 0;       /* DF bit on encap. 0: clear 1: set 2: copy */
+int ip4_ipsec_dfbit = 2;       /* DF bit on encap. 0: clear 1: set 2: copy */
 int ip4_esp_trans_deflev = IPSEC_LEVEL_USE;
 int ip4_esp_net_deflev = IPSEC_LEVEL_USE;
 int ip4_ah_trans_deflev = IPSEC_LEVEL_USE;

-- 
        Greg Troxel <gdt@ir.bbn.com>