Subject: kern/26529: ipfilter w/ IPSEC and gif(4) losing packets (looks like a fragmentation issue)
To: None <>
From: None <>
List: netbsd-bugs
Date: 08/03/2004 07:06:50
>Number:         26529
>Category:       kern
>Synopsis:       ipfilter w/ IPSEC and gif(4) losing packets (looks like a fragmentation issue)
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Aug 03 09:31:00 UTC 2004
>Originator:     Arto Selonen
>Release:        NetBSD-current with sources from ~20040728
NetBSD blah 2.0G NetBSD 2.0G (BLAH) #58: Wed Jul 28 10:16:22 EEST 2004  blah@blah:/obj/sys/arch/i386/compile/BLAH i386

When connecting through the box (acting as gateway/firewall) using IPSEC transport mode (with gif(4) tunnel), large(?) return packets "disappear".

It looks like they reach the box with DF set and, when they need to be
fragmented due to tunneling overhead, get dropped with only some statistics
counters updated, i.e. without an ICMP need-fragment error being sent.
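
As a rough illustration of the tunneling overhead, the sketch below estimates
the largest inner packet that still fits into one outer packet. The helper
name max_inner_packet and all header sizes are assumptions, not values from
this report: a 20-byte outer IPv4 header for gif(4), a 24-byte AH header
(12-byte ICV for hmac-sha1), and ESP with rijndael-cbc using a 16-byte IV,
16-byte blocks and no ESP authentication.

```shell
#!/bin/sh
# Estimate the largest inner IPv4 packet that fits in one outer packet
# when gif(4) IPv4-in-IPv4 is protected by transport-mode ESP + AH.
# All header sizes below are assumptions, not measurements from the report.

max_inner_packet() {
    link_mtu=$1
    outer_ip=20     # outer IPv4 header added by gif(4), no IP options
    ah_hdr=24       # AH: 12 bytes fixed + 12-byte truncated HMAC-SHA1 ICV
    esp_hdr=8       # ESP: SPI + sequence number
    esp_iv=16       # rijndael-cbc IV (one 16-byte block)
    esp_trailer=2   # ESP pad-length + next-header bytes
    block=16        # cipher block size; ciphertext is a whole number of blocks

    # Bytes left for the ESP ciphertext after all fixed headers.
    room=$((link_mtu - outer_ip - ah_hdr - esp_hdr - esp_iv))
    # Round down to whole cipher blocks, then remove the ESP trailer.
    cipher=$((room / block * block))
    echo $((cipher - esp_trailer))
}

max_inner_packet 1500   # prints 1422 under these assumptions
```

Under these assumptions, a full-size 1500-byte return packet with DF set
cannot be forwarded over the tunnel without fragmentation, which would match
the "large packets disappear" symptom.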

This was detected after upgrading to ipfilter 4.1.3, but since 4.1.2
had only been running for a fairly short period of time, it is possible
that this problem was present at that time, but not detected due to
minimal tunnel usage (and thus probably only with fairly small packets).



The problems appear in gw2, which is the only system that has been upgraded since late March (to be able to test ipfilter changes in gw2
against known, working state prior to ipfilter 4.x upgrade on gw2).

There is a transport mode IPSEC between gw1 and gw2, with gw2 ipsec.conf (with hopefully obvious obfuscation for readability):

add <fxp0> <ex0>  esp 1234 -E rijndael-cbc 0xdeadbeef1;
add <fxp0> <ex0>  ah  2345 -A hmac-sha1    0xdeadbeef2;
add <ex0>  <fxp0> esp 3456 -E rijndael-cbc 0xdeadbeef3;
add <ex0>  <fxp0> ah  4567 -A hmac-sha1    0xdeadbeef4;
spdadd <ex0> <fxp0> any -P in  ipsec esp/transport//require ah/transport//require;
spdadd <fxp0> <ex0> any -P out ipsec esp/transport//require ah/transport//require;

Symmetric setup exists at gw1. There is also a gif(4) tunnel between
gw1 and gw2, with gw2 /etc/ifconfig.gif0 shown:

tunnel <fxp0> <ex0> netmask 0xfffffffc

Again, symmetric setup on gw1. To force the use of the gif tunnel, and
thus IPSEC, there also exist routes on gw1 and gw2, with gw1 having:

route add -net <fxp1>

Now, when making a connection from gw1 (or client) to server behind gw2,
the following can be observed (for an HTTP request):

The gw1 gif interface shows (only outgoing packets; see PR#25796) the
handshake and the request (repeated?). At the same time, the gw2 gif
interface shows (again just outgoing packets) the handshake, and probably a
response to the request. tcpdump on gw1 ex0 shows 7 AH packets to gw2, and
6 AH packets from gw2 back. This agrees with the tcpdump on fxp0@gw2. Both
have continuous sequence numbers (so no IPSEC packets are dropped). Nothing
shows up in the ipmon logs (and all default block rules are logged).

The following counters change during a failing connection:
gw1 (that might be related):
     - ip: packets not forwardable
     - icmp: calls to icmp_error
     - icmp: destination unreachable (output histogram)
     - icmp: destination unreachable (input histogram)
     - tcp: out-of-order packets
gw2:
     - ip: packets not forwardable
     - ip: datagrams that can't be fragmented
     - icmp: calls to icmp_error
     - icmp: destination unreachable (output histogram)
     - tcp: out-of-order packets
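
Counter changes like the ones listed above can be captured by taking a
snapshot of the protocol statistics before and after a failing connection
and comparing the two. The following is only a sketch: the helper names
snapshot_counters and changed_counters are hypothetical, and the file names
in the usage comment are placeholders.

```shell
#!/bin/sh
# Compare protocol statistics taken before and after a failing connection.
# snapshot_counters and changed_counters are hypothetical helper names.

snapshot_counters() {
    # netstat -s prints per-protocol statistics; -p limits it to one protocol.
    for proto in ip icmp tcp; do
        netstat -s -p "$proto"
    done
}

changed_counters() {
    # $1 = snapshot taken before the test, $2 = snapshot taken after it.
    # Print only the changed lines: "<" is the old value, ">" the new one.
    diff "$1" "$2" | grep '^[<>]'
}

# Usage (run on gw1 and on gw2):
#   snapshot_counters > /tmp/before
#   ... make the failing HTTP request ...
#   snapshot_counters > /tmp/after
#   changed_counters /tmp/before /tmp/after
```

Counters that move for unrelated traffic will also show up, so the snapshots
are best taken on an otherwise quiet system.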

No related ICMP packets were observed on fxp1@gw2. Since there seem to be at least some problems with icmp_err routines (see PR#26471), this
could be related to that, and IPSEC may be just what triggers it as it
requires fragmentation, etc.

Is there anything else I could provide or test to help solve this? I would
really like to have a working VPN connection between gw1 and gw2.

With the setup I have, I can reproduce this at will. I don't know what
the minimum requirements for reproducing it might be.
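
One possible way to narrow down the size at which packets start to disappear
is to send pings with DF set and search for the largest payload that still
gets a reply. This is a sketch only and was not part of the original test:
probe and largest_ok are hypothetical helper names, TARGET is a placeholder
for a host on the far side of the tunnel, and ICMP echo traffic may not
exercise exactly the same path as the failing TCP connection. Depending on
the ping(8) defaults, a timeout option may also be needed so that a lost
probe does not block the search.

```shell
#!/bin/sh
# Find the largest ICMP payload that still gets a reply with DF set.
# probe() and largest_ok() are hypothetical helpers; TARGET is a placeholder.

probe() {
    # NetBSD ping(8): -D sets Don't Fragment, -s sets the payload size,
    # -c 1 sends a single packet. Exit status 0 means a reply came back.
    ping -D -c 1 -s "$1" "$TARGET" >/dev/null 2>&1
}

largest_ok() {
    # Binary search between $1 (assumed to work) and $2 (assumed to fail).
    lo=$1
    hi=$2
    while [ $((hi - lo)) -gt 1 ]; do
        mid=$(( (lo + hi) / 2 ))
        if probe "$mid"; then
            lo=$mid     # reply received: the limit is at or above mid
        else
            hi=$mid     # no reply: the limit is below mid
        fi
    done
    echo "$lo"
}

# Usage:
#   TARGET=<host behind the other gateway>
#   largest_ok 64 1472
```

The reported value is the ICMP payload size; the IP packet is 28 bytes
larger (20-byte IP header without options plus 8-byte ICMP header).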