Subject: IPv6 problem, UltraSparc-specific?
To: None <port-sparc64@netbsd.org>
From: Stephane Bortzmeyer <stephane@sources.org>
List: port-sparc64
Date: 06/26/2007 13:46:03
[Posted here because it works on i386/NetBSD 3.1 and fails on
UltraSparc/NetBSD 3.1.]

On my home LAN, only one machine cannot run IPv6 properly:

% curl -6 -v http://www.afnic.fr 
* About to connect() to www.afnic.fr port 80 (#0)
*   Trying 2001:660:3003:2::4:20... connected
* Connected to www.afnic.fr (2001:660:3003:2::4:20) port 80 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.16.1 (sparc64--netbsd) libcurl/7.16.1 OpenSSL/0.9.7d zlib/1.1.4 libidn/0.6.11
> Host: www.afnic.fr
> Accept: */*

[Then nothing, transfer is stuck]

All the other boxes on the network can do this test, including the
other NetBSD 3.1 box.

ping6 and traceroute6 both work on the offending machine. Only TCP
transfers have the problem (tested above with curl, but echoping has
the same issue).

It does not seem to be an MTU problem. My router, a Debian/Linux "sarge"
(connected with PPPoE/ADSL) runs radvd and advertises:

   AdvLinkMTU 1460;

and all the other boxes (i386/NetBSD, i386/Gentoo/Linux,
i386/Debian/Linux) seem happy (I did not set the MTUs manually; I
rely on "TCP MSS clamping" for IPv4 and on RA's AdvLinkMTU for IPv6).
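
For reference, a rough sketch of the arithmetic behind that setting (in
Python; the function and constant names are made up for illustration and
do not come from any tool). It assumes a plain 40-byte IPv6 header with
no extension headers, a 20-byte TCP header, and a 12-byte timestamp
option on data segments. With AdvLinkMTU 1460 this gives an MSS of 1400,
which is what the working i386 box puts in its SYN in the capture further
down, and 1388 bytes of payload per segment, which matches the data
segments the server sends to it. The UltraSparc's SYN advertises
mss 33076, which corresponds to neither value; whether that is related to
the stall is not established here.

```python
# Header sizes are fixed by the protocols; the MTU values come from the post.
IPV6_HEADER = 40        # bytes, fixed IPv6 header, no extension headers assumed
TCP_HEADER = 20         # bytes, TCP header without options
TIMESTAMP_OPTION = 12   # bytes, TCP timestamp option padded with two NOPs


def expected_mss(link_mtu: int) -> int:
    """MSS a host would normally advertise over IPv6 for a given link MTU."""
    return link_mtu - IPV6_HEADER - TCP_HEADER


def payload_per_segment(mss: int) -> int:
    """Payload bytes per segment when each segment carries a timestamp option."""
    return mss - TIMESTAMP_OPTION


lan_mss = expected_mss(1460)          # AdvLinkMTU announced by radvd
print(lan_mss)                        # 1400
print(payload_per_segment(lan_mss))   # 1388
print(expected_mss(1500))             # 1440, the MSS in the server's SYN-ACK
```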

tcpdump on the router shows that the transfer stops before the big
packets (2001:660:3003:2::4:20 == www.afnic.fr):

13:33:43.088932 2001:7a8:7509:0:a00:20ff:fe99:faf4.65527 > 2001:660:3003:2::4:20.80: S 1901257151:1901257151(0) win 32768 <mss 33076,nop,wscale 0,sackOK,nop,nop,nop,nop,timestamp 0[|tcp]> [flowlabel 0x6a261]
13:33:43.137401 fe80::204:75ff:fece:efbe > ff02::1:ff99:faf4: icmp6: neighbor sol: who has 2001:7a8:7509:0:a00:20ff:fe99:faf4
13:33:43.137638 fe80::a00:20ff:fe99:faf4 > fe80::204:75ff:fece:efbe: icmp6: neighbor adv: tgt is 2001:7a8:7509:0:a00:20ff:fe99:faf4
13:33:43.137662 2001:660:3003:2::4:20.80 > 2001:7a8:7509:0:a00:20ff:fe99:faf4.65527: S 644763915:644763915(0) ack 1901257152 win 5712 <mss 1440,sackOK,timestamp 1815943791 0,nop,wscale 5>
13:33:43.137861 2001:7a8:7509:0:a00:20ff:fe99:faf4.65527 > 2001:660:3003:2::4:20.80: . ack 1 win 32768 <nop,nop,timestamp 0 1815943791> [flowlabel 0x6a261]
13:33:43.138655 2001:7a8:7509:0:a00:20ff:fe99:faf4.65527 > 2001:660:3003:2::4:20.80: P 1:150(149) ack 1 win 32768 <nop,nop,timestamp 0 1815943791> [flowlabel 0x6a261]
13:33:43.196218 2001:660:3003:2::4:20.80 > 2001:7a8:7509:0:a00:20ff:fe99:faf4.65527: . ack 150 win 212 <nop,nop,timestamp 1815943849 0>

[Then nothing]

The offending machine is an UltraSparc, which seems to be its only
peculiarity:

% uname -a
NetBSD preston 3.1 NetBSD 3.1 (PRESTON) #1: Sun Feb 11 11:53:51 CET 2007  root@preston:/usr/obj/sys/arch/sparc64/compile/PRESTON sparc64

hme0: flags=8a63<UP,BROADCAST,NOTRAILERS,RUNNING,ALLMULTI,SIMPLEX,MULTICAST> mtu 1500
        capabilities=66<TCP4CSUM,UDP4CSUM,TCP4CSUM_Rx,UDP4CSUM_Rx>
        enabled=0
        address: 08:00:20:99:fa:f4
        media: Ethernet autoselect (100baseTX full-duplex)
        status: active
        inet 172.19.1.2 netmask 0xffffff00 broadcast 172.19.1.255
        inet6 fe80::a00:20ff:fe99:faf4%hme0 prefixlen 64 scopeid 0x1
        inet6 2001:7a8:7509:0:a00:20ff:fe99:faf4 prefixlen 64

Here is the same tcpdump, with the other NetBSD 3.1 machine, an i386
which works fine:

13:35:32.703105 2001:7a8:7509:0:216:3eff:fe78:b525.65513 > 2001:660:3003:2::4:20.80: S 4227314701:4227314701(0) win 32768 <mss 1400,nop,wscale 0,sackOK,nop,nop,nop,nop,timestamp 0[|tcp]> [flowlabel 0xbbb6b]
13:35:32.755655 2001:660:3003:2::4:20.80 > 2001:7a8:7509:0:216:3eff:fe78:b525.65513: S 764923309:764923309(0) ack 4227314702 win 5712 <mss 1440,sackOK,timestamp 1816053405 0,nop,wscale 5>
13:35:32.764543 2001:7a8:7509:0:216:3eff:fe78:b525.65513 > 2001:660:3003:2::4:20.80: . ack 1 win 33600 <nop,nop,timestamp 0 1816053405> [flowlabel 0xbbb6b]
13:35:32.765007 2001:7a8:7509:0:216:3eff:fe78:b525.65513 > 2001:660:3003:2::4:20.80: P 1:150(149) ack 1 win 33600 <nop,nop,timestamp 0 1816053405> [flowlabel 0xbbb6b]
13:35:32.822594 2001:660:3003:2::4:20.80 > 2001:7a8:7509:0:216:3eff:fe78:b525.65513: . ack 150 win 212 <nop,nop,timestamp 1816053473 0>
13:35:32.845134 2001:660:3003:2::4:20.80 > 2001:7a8:7509:0:216:3eff:fe78:b525.65513: . 1:1389(1388) ack 150 win 212 <nop,nop,timestamp 1816053481 0>
13:35:32.856199 2001:660:3003:2::4:20.80 > 2001:7a8:7509:0:216:3eff:fe78:b525.65513: . 1389:2777(1388) ack 150 win 212 <nop,nop,timestamp 1816053482 0>
13:35:32.903299 2001:7a8:7509:0:216:3eff:fe78:b525.65513 > 2001:660:3003:2::4:20.80: . ack 2777 win 32212 <nop,nop,timestamp 1 1816053481> [flowlabel 0xbbb6b]
13:35:32.967111 2001:660:3003:2::4:20.80 > 2001:7a8:7509:0:216:3eff:fe78:b525.65513: . 2777:4165(1388) ack 150 win 212 <nop,nop,timestamp 1816053605 1>

[And many more packets]
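
To compare the two captures without reading the option lists by eye, the
MSS advertised in each SYN can be pulled out with a few lines of Python.
This is only a sketch: the helper names (syn_mss, fits_link) are
hypothetical, the regular expression assumes the "<mss N,...>" layout
that tcpdump prints above, and the 60-byte overhead assumes an IPv6
header without extension headers plus a TCP header without options.

```python
import re

# Matches the MSS option inside the TCP option list printed by tcpdump,
# e.g. "<mss 1400,nop,wscale 0,...>".
MSS_RE = re.compile(r"<mss (\d+)")


def syn_mss(line):
    """Return the MSS advertised in a tcpdump SYN line, or None if absent."""
    match = MSS_RE.search(line)
    return int(match.group(1)) if match else None


def fits_link(mss, link_mtu):
    """True if the MSS is no larger than link_mtu minus IPv6 and TCP headers."""
    return mss <= link_mtu - 60


# SYN lines from the two captures, timestamps and flow labels dropped.
sparc64_syn = ("2001:7a8:7509:0:a00:20ff:fe99:faf4.65527 > "
               "2001:660:3003:2::4:20.80: S 1901257151:1901257151(0) "
               "win 32768 <mss 33076,nop,wscale 0,sackOK,nop,nop,nop,nop,"
               "timestamp 0[|tcp]>")
i386_syn = ("2001:7a8:7509:0:216:3eff:fe78:b525.65513 > "
            "2001:660:3003:2::4:20.80: S 4227314701:4227314701(0) "
            "win 32768 <mss 1400,nop,wscale 0,sackOK,nop,nop,nop,nop,"
            "timestamp 0[|tcp]>")

LINK_MTU = 1460  # AdvLinkMTU advertised by radvd on the router

for name, line in (("sparc64", sparc64_syn), ("i386", i386_syn)):
    mss = syn_mss(line)
    if mss is not None:
        print(name, mss, fits_link(mss, LINK_MTU))
# sparc64 33076 False
# i386 1400 True
```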