Subject: Re: packet loss? w/ 1.6[A-D] & IPSEC policy
To: Arto Selonen <arto@selonen.org>
From: None <itojun@iijlab.net>
List: current-users
Date: 07/24/2002 04:52:05
>I have had IPSEC policies in place since late May 2001, and things
>have worked as expected up to 1.5ZC. After upgrading to 1.6A,
>and continuing to the current 1.6B/1.6D client/server pair I have had
>problems. SSH seems to work without noticeable effects, but
>eg. web surfing from client to server breaks with connections
>eventually timing out, etc.
>
>If my memory serves me right, then this started happening as soon as I
>upgraded the client (10.1.1.1) from 1.5ZC to 1.6A, even though the
>problems *seem* to be at the server end (which first stayed at 1.5ZC
>and was then upgraded to 1.6A, 1.6B and 1.6D without any help).
to reproduce the symptom, i have configured two boxes as shown below.
the policy differences ("use" instead of "require", and the lack of an
inbound policy) should not matter.
from some large ping (with DF=1) results, i judge that all
intermediate link MTUs are >= 1500.
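As a rough illustration of the DF=1 check above: assuming a 20-byte IPv4 header (no options) and an 8-byte ICMP echo header, the largest `ping -s` payload that fits in one unfragmented 1500-byte packet is 1472 bytes. The sketch below only does that arithmetic. The function name is made up for illustration, and IPsec (ESP/AH) overhead is deliberately ignored; it would lower the number further.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
ICMP_HEADER = 8   # bytes, ICMP echo request header


def max_ping_payload(mtu):
    """Largest `ping -s` size that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


print(max_ping_payload(1500))  # 1472
```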
i've got the following results:
- short ping (ping 203.178.141.199): no packet drops observed
- long ping (ping -s 2000 203.178.141.199): no packet drops observed
- ping with short interval (ping -i 0.1 203.178.141.199):
  340 packets transmitted, 339 packets received, 0.3% packet loss
- long ping with short interval (ping -i 0.1 -s 2000 203.178.141.199):
  442 packets transmitted, 170 packets received, 61.5% packet loss
if i replace rijndael-cbc with des-cbc:
- ping with short interval (ping -i 0.1 203.178.141.199):
  304 packets transmitted, 304 packets received, 0.0% packet loss
- long ping with short interval (ping -i 0.1 -s 2000 203.178.141.199):
  304 packets transmitted, 155 packets received, 49.0% packet loss
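The loss percentages quoted above follow directly from the transmitted/received counts. A minimal sketch to recompute them is below; the helper name is hypothetical, and rounding to one decimal place is an assumption chosen to match the ping summary lines.

```python
def loss_percent(transmitted, received):
    """Packet loss as a percentage, rounded to one decimal place."""
    return round(100.0 * (transmitted - received) / transmitted, 1)


# (transmitted, received) pairs taken from the ping summaries above
runs = {
    "rijndael-cbc, short": (340, 339),
    "rijndael-cbc, -s 2000": (442, 170),
    "des-cbc, short": (304, 304),
    "des-cbc, -s 2000": (304, 155),
}

for label, (tx, rx) in runs.items():
    print(label, loss_percent(tx, rx))
```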
i tested some TCP sessions.
- short tcp sessions (like finger) work fine
- i observed tcp connection drops with EMSGSIZE on a large scp from
  210.160.95.102 (scp netbsd 203.178.141.199:/dev/null), while the
  opposite direction does not get EMSGSIZE.
from the above, my current observation is as follows:
- IPsec processing is slower than the incoming packet rate, so the
  interface input queue fills up and packets get dropped. this is the
  main reason for the loss. all ifq_drops are 0, so my guess is that
  the IPv4 queue length (ipintrq.ifq_maxlen) is exceeded. unfortunately
  there's no stat variable for ipintrq overflow.
- there seem to be some differences in error handling in the tcp layer
  between 1.6D and 1.6_BETA4.
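
The queue-overflow guess in the first point can be illustrated with a toy model: a bounded FIFO fed at a fixed rate and drained by a single slower consumer. This is only a sketch under stated assumptions, not the kernel's actual code path. The function, its parameters, and the tick units are invented for illustration, and `maxlen` merely stands in for a limit such as ipintrq.ifq_maxlen.

```python
from collections import deque


def simulate(n_packets, arrival_interval, service_time, maxlen):
    """Return (delivered, dropped) for a bounded FIFO with one consumer.

    Times are in arbitrary integer ticks. A packet arriving while the
    queue already holds `maxlen` packets is dropped (tail drop).
    """
    queue = deque()   # arrival times of packets waiting or in service
    next_done = None  # tick at which the head packet finishes service
    delivered = dropped = 0

    for i in range(n_packets):
        now = i * arrival_interval
        # Finish every packet whose service completes by this arrival.
        while queue and next_done <= now:
            queue.popleft()
            delivered += 1
            next_done = next_done + service_time if queue else None
        if len(queue) >= maxlen:
            dropped += 1
        else:
            if not queue:
                next_done = now + service_time
            queue.append(now)

    delivered += len(queue)  # the consumer drains what is left
    return delivered, dropped


# Consumer faster than arrivals: nothing is dropped.
print(simulate(1000, arrival_interval=2, service_time=1, maxlen=50))
# Consumer slower than arrivals: the queue fills, then drops begin.
print(simulate(1000, arrival_interval=1, service_time=2, maxlen=50))
```

In this model, no drops occur as long as the service time is at most the arrival interval. Once processing is slower than the arrival rate, the queue only delays the onset of loss; it does not prevent it.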
itojun
--- 210.160.95.102
NetBSD starfruit.itojun.org 1.6D NetBSD 1.6D (STARFRUIT) #148: Sat Jul 20 08:36:32 JST 2002 itojun@starfruit.itojun.org:/usr/home/itojun/NetBSD/src/sys/arch/i386/compile/STARFRUIT i386
add 210.160.95.102 203.178.141.199 esp 10000 -E rijndael-cbc 0x0000000000000000000000000000000000000000000000000000000000000000;
add 203.178.141.199 210.160.95.102 esp 10001 -E rijndael-cbc 0x0000000000000000000000000000000000000000000000000000000000000000;
add 210.160.95.102 203.178.141.199 ah 10002 -A hmac-sha1 0x0000000000000000000000000000000000000000;
add 203.178.141.199 210.160.95.102 ah 10003 -A hmac-sha1 0x0000000000000000000000000000000000000000;
spdadd 210.160.95.102 203.178.141.199 any -P out ipsec esp/transport//use ah/transport//use;
--- 203.178.141.199
NetBSD banana.kame.net 1.6_BETA4 NetBSD 1.6_BETA4 (GENERIC) #1: Wed Jul 24 04:03:59 JST 2002 itojun@banana.kame.net:/usr/home/itojun/NetBSD16/src/sys/arch/i386/compile/GENERIC i386
add 210.160.95.102 203.178.141.199 esp 10000 -E rijndael-cbc 0x0000000000000000000000000000000000000000000000000000000000000000;
add 203.178.141.199 210.160.95.102 esp 10001 -E rijndael-cbc 0x0000000000000000000000000000000000000000000000000000000000000000;
add 210.160.95.102 203.178.141.199 ah 10002 -A hmac-sha1 0x0000000000000000000000000000000000000000;
add 203.178.141.199 210.160.95.102 ah 10003 -A hmac-sha1 0x0000000000000000000000000000000000000000;
spdadd 203.178.141.199 210.160.95.102 any -P out ipsec esp/transport//use ah/transport//use;
--- traceroute
% traceroute -q1 203.178.141.199
traceroute to 203.178.141.199 (203.178.141.199), 64 hops max, 40 byte packets
1 entry (210.160.95.110) 5.838 ms
2 210.145.251.162 (210.145.251.162) 119.809 ms
3 210.145.251.161 (210.145.251.161) 27.550 ms
4 211.16.14.217 (211.16.14.217) 28.450 ms
5 211.6.5.66 (211.6.5.66) 30.594 ms
6 210.254.187.53 (210.254.187.53) 28.532 ms
7 61.207.0.158 (61.207.0.158) 28.420 ms
8 61.207.0.222 (61.207.0.222) 28.496 ms
9 210.163.252.189 (210.163.252.189) 28.566 ms
10 foundry2.otemachi.wide.ad.jp (202.249.2.83) 28.901 ms
11 pc3.yagami.wide.ad.jp (203.178.138.245) 29.369 ms
12 hitachi1.k2c.wide.ad.jp (203.178.138.218) 30.877 ms
13 kame199.kame.net (203.178.141.199) 36.653 ms