current-users: IPSEC-related fragmentation issue?

Subject: IPSEC-related fragmentation issue?
To: current-users@netbsd.org
From: Arto Selonen <arto@selonen.org>
List: current-users
Date: 03/29/2004 21:11:21

Hi!

I have the following type of situation:

	A(ep0) <-->IPSEC--> (fxp0)GW(fxp1) <-->plain--> (fxp0)B

All three are NetBSD-current systems: A is from Dec 4th, B is from Mar 1st,
and GW was upgraded this morning (from whatever sources the anoncvs
fi-mirror served around 09:30 EEST). Before the upgrade, GW was running
sources from around Feb 25th.

The transport-mode IPSEC tunnel uses 10/8 addresses internally, so there
is also NAT involved on fxp1. This setup has been in use since 10/2001
and has worked without problems (apart from those introduced by OS bugs,
etc.).

After the upgrade, I've observed the following:

	- ssh from A to B works at first, but stalls fairly soon
	- running 'tcpdump -i interface ip-of-B' on GW shows:
		20:01:02.946013 B.ssh > GW.1084: . 1:1461(1460) ack 104 win 33580
		20:01:09.737922 GW > B: icmp: ip reassembly time exceeded

The same problem appears on most TCP-based connections to systems behind
fxp1. Sooner or later, GW sends them "ip reassembly time exceeded";
after that nothing is accepted, and the connection is effectively dead.
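
For reference, "ip reassembly time exceeded" is ICMP type 11, code 1: a
host sends it when it holds some fragments of a datagram but the rest do
not arrive before its reassembly timer expires. The sketch below only
illustrates that completeness check; the function name and the tuple
layout are hypothetical and are not taken from the NetBSD sources.

```python
def datagram_complete(fragments):
    """Return True if the fragments cover one whole datagram.

    fragments: iterable of (offset_bytes, length_bytes, more_fragments).
    Illustrative only; a real stack tracks this per (src, dst, proto, id).
    """
    total = None
    spans = []
    for offset, length, more_fragments in fragments:
        spans.append((offset, offset + length))
        if not more_fragments:
            total = offset + length      # the last fragment fixes the size
    if total is None:
        return False                     # last fragment never arrived
    pos = 0
    for start, end in sorted(spans):
        if start > pos:
            return False                 # hole in the middle
        pos = max(pos, end)
    return pos >= total
```

If this still returns False when the timer fires, the queued fragments
are dropped and the ICMP error is sent back to the source.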

I've also seen the following "death":

	20:38:36.782326 B.ssh > GW.1098: . 7388:8848(1460) ack 2913 win 33580 (DF)
	20:38:36.782371 GW > B: icmp: GW unreachable - need to frag
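
A 1460-byte TCP segment with 20-byte TCP and IP headers is already a
full 1500-byte packet, so any ESP overhead pushes it past a 1500-byte
MTU, and with DF set the packet cannot be fragmented. The arithmetic can
be sketched as follows. The cipher parameters (8-byte IV, 8-byte block,
12-byte ICV, as with 3DES-CBC plus HMAC-SHA1-96), the 1500-byte MTU, and
the function names are assumptions, not details of this setup.

```python
def esp_transport_size(tcp_payload, ip_hdr=20, tcp_hdr=20,
                       iv_len=8, block=8, icv_len=12):
    """IPv4 packet size after ESP transport-mode encapsulation.

    Defaults are ASSUMED cipher parameters, not taken from this setup.
    """
    inner = tcp_hdr + tcp_payload            # bytes that ESP encrypts
    trailer = 2                              # pad length + next header
    pad = (-(inner + trailer)) % block       # align to the cipher block
    esp_hdr = 8                              # SPI + sequence number
    return ip_hdr + esp_hdr + iv_len + inner + pad + trailer + icv_len


def max_tcp_payload(mtu, **kw):
    """Largest TCP payload whose ESP-wrapped packet still fits in mtu."""
    payload = mtu
    while payload > 0 and esp_transport_size(payload, **kw) > mtu:
        payload -= 1
    return payload
```

Under these assumptions a 1460-byte segment becomes a 1536-byte packet,
and the largest segment that fits in 1500 bytes is 1426 bytes.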

I haven't noticed any problems with SSH connections from B to A (I use
those to forward some A-local ports that carry mp3 streams, and they come
in just fine). Similarly, no problems have been observed with other
systems connecting through GW to fxp1-connected systems. So it looks
like an IPSEC/NAT-related issue.

Currently, I can reproduce these at will. Looking at the source changes,
I suspect one of the following plays a part in this:

	- IPfilter 4.1.1 less than two days ago
	- various IPSEC changes during previous weeks

Whatever the change, it probably took place in March, and it is now
breaking packet flow through GW (or else A/B are broken, but I'd like to
know before upgrading them to a possibly broken -current, too). It looks
like a PMTU/PMTUD-type problem, or maybe a bug in packet size
calculations?

Any ideas?


Artsi
-- 
#######======------  http://www.selonen.org/arto/  --------========########
Everstinkuja 5 B 35                               Don't mind doing it.
FIN-02600 Espoo        arto@selonen.org         Don't mind not doing it.
Finland              tel +358 50 560 4826     Don't know anything about it.