Subject: kern/13944: ipnat mangles checksum on ICMP Host Unreachables from NATting host
To: None <>
From: None <>
List: netbsd-bugs
Date: 09/13/2001 06:29:08
>Number:         13944
>Category:       kern
>Synopsis:       NAT translation of ICMP Host Unreachable messages resulting from DF-bit-set packets which cannot traverse a gateway doing NAT causes a bad IP checksum, resulting in a PMTUD black hole.
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Sep 13 03:30:00 PDT 2001
>Originator:     Thor Lancelot Simon
>Release:        NetBSD-1.5.1 *and* NetBSD-current as of 2001-09-09
not much.
System: NetBSD 1.5X NetBSD 1.5X (REKUSANT) #65: Sun Sep 9 04:20:20 EDT 2001 i386
Architecture: i386
Machine: i386
A Path MTU black hole occurs when a gateway will not pass a packet with the
DF bit set but fails to return an appropriate ICMP Host Unreachable message
to the originator of the packet.  NAT under NetBSD currently causes such a
black hole condition when the gateway generating the Host Unreachable message
is using ipnat to rewrite addresses.  It appears that though NAT is applied
correctly to the addresses in the payload of the locally-generated ICMP Host 
Unreachable message, the IP checksum is not correctly updated to reflect the
alteration of the packet data.  For example (in this case, "rekord" is a
NetBSD-1.5X machine which is a gateway using ipf and ipnat):

06:22:10.508679 0:60:ef:20:64:34 0:50:4:1b:b5:6a ip 70: rekord > icmp: host unreachable for > [|tcp] (ttl 126, id 1590, len 40, bad cksum bd82!) (ttl 255, id 4094, len 56)

If the host trying to do Path MTU discovery doesn't do a good job of
black hole detection (for example, Win2K), it experiences mysterious TCP
connection freezes at best, and a total inability to communicate with TCP
through the NetBSD gateway at worst.

Oh, just to be clear -- the gateway in question does *not* have an Ethernet
interface in it that does hardware-assisted checksum, and the bad checksums
show up in a tcpdump of the outbound packets from the gateway's interior
interface as well.
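
For reference, here is a minimal sketch of what a correct rewrite of an
IPv4 header has to do to keep its checksum valid, whether that header is
the outer one or the copy quoted inside an ICMP error.  This is not the
ipnat code; the function names (inet_cksum, cksum_adjust16, rewrite_src)
are made up for illustration, and it assumes a 20-byte header with no
options.  The checksum is the RFC 1071 one's-complement sum, and the
patch-up uses the incremental update from RFC 1624.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* RFC 1071 Internet checksum over len bytes, read in network byte order. */
static uint16_t inet_cksum(const uint8_t *p, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)p[i] << 8) | p[i + 1];
    if (len & 1)                      /* odd trailing byte, zero padded */
        sum += (uint32_t)p[len - 1] << 8;
    while (sum >> 16)                 /* fold carries back in */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* RFC 1624, eqn. 3: HC' = ~(~HC + ~m + m') for one changed 16-bit word. */
static uint16_t cksum_adjust16(uint16_t hc, uint16_t old_w, uint16_t new_w)
{
    uint32_t sum = (uint16_t)~hc;

    sum += (uint16_t)~old_w;
    sum += new_w;
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/*
 * Hypothetical helper: replace the source address of an option-less
 * IPv4 header and patch the header checksum (bytes 10-11) to match.
 */
static void rewrite_src(uint8_t *hdr, const uint8_t new_src[4])
{
    uint16_t hc = (uint16_t)((hdr[10] << 8) | hdr[11]);
    int i;

    assert((hdr[0] & 0x0f) == 5);     /* assumption: IHL is 5, no options */
    for (i = 0; i < 4; i += 2) {
        uint16_t old_w = (uint16_t)((hdr[12 + i] << 8) | hdr[13 + i]);
        uint16_t new_w = (uint16_t)((new_src[i] << 8) | new_src[i + 1]);

        hc = cksum_adjust16(hc, old_w, new_w);
        hdr[12 + i] = new_src[i];
        hdr[13 + i] = new_src[i + 1];
    }
    hdr[10] = (uint8_t)(hc >> 8);
    hdr[11] = (uint8_t)(hc & 0xff);
}
```

A header whose checksum field is correct makes inet_cksum() over all 20
bytes return 0; a non-zero result is the condition tcpdump reports as
"bad cksum".  If the addresses are rewritten without the adjustment step,
that check fails in just the way shown in the trace above.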

Put a host that does Path MTU discovery behind a NetBSD gateway that's
doing NAT.  Watch what happens using tcpdump (note that you must do -v -v
to get the full ICMP information).  You will see the problem more quickly
if you artificially constrain the MTU on the exterior interface of the
NetBSD gateway.

I had a look at the code that NATs ICMP messages, but I couldn't figure out
where messages originating on the local host were processed.  It seems clear
that either fix_datacksum is broken, or something is not calling it as it