Subject: Re: IPv6 over GRE tunneling?
To: None <tech-net@netbsd.org>
From: Gert Doering <gert@greenie.muc.de>
List: tech-net
Date: 01/25/2005 23:47:11
Hi,
On Sun, Jan 23, 2005 at 09:54:08PM +0100, Gert Doering wrote:
> and I think that modifications to if_gre.c & ip_gre.c to accept
> IPv6-over-GRE should not be too hard.
Actually it was fairly trivial - except for one really nasty bug in
net/if_gre.c that hits everyone trying to send a non-IPv4 packet
over GRE (ip->ip_tos is copied to the new IP header, and "ip" is NULL)
- a bug that will bite XEROX and NS users as well, if there are any.
[-> the parts of my diff related to ip_tos in net/if_gre.c should be
integrated in any case!]
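
To illustrate the ip_tos problem outside the kernel, here is a minimal
userspace sketch of the pattern the diff uses: default the TOS to 0 and
read it from the inner header only when the payload really is IPv4.
The names demo_family, demo_ip and demo_outer_tos are made up for this
example and are not part of if_gre.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's AF_* constants. */
enum demo_family { DEMO_AF_INET = 1, DEMO_AF_INET6 = 2 };

/* Hypothetical stand-in for struct ip; only the TOS byte is modelled. */
struct demo_ip {
	uint8_t tos;
};

/*
 * Pick the TOS byte for the outer (delivery) IPv4 header.
 * "inner" is dereferenced only for IPv4 payloads; every other family
 * gets the default of 0, mirroring "u_int8_t ip_tos = 0" in the diff.
 */
static uint8_t
demo_outer_tos(int family, const struct demo_ip *inner)
{
	uint8_t tos = 0;

	if (family == DEMO_AF_INET) {
		assert(inner != NULL);
		tos = inner->tos;
	}
	return tos;
}
```

With this shape, calling demo_outer_tos(DEMO_AF_INET6, NULL) is safe and
yields 0, while an IPv4 payload still has its TOS copied.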
Diffs are appended below (vs. -current), and I'd appreciate it if you
could review them.
There is one problem remaining: for IPv6-over-GRE packets, there is a
weird delay upon reception. It's like schednetisr() isn't called,
except that the delay always seems to be about 300-1200 ms, while without
schednetisr(), it's indefinite (that is: until another interface receives
an IPv6 packet, to be precise).
Look at this tcpdump on the LAN interface:
23:19:26.267313 195.30.70.42 > 193.149.48.168: gre 2001:608:4:4444::1 > 2001:608:4:4444::2: icmp6: echo request (len 60, hlim 64) (ttl 255, id 825, len 124)
23:19:27.376332 193.149.48.168 > 195.30.70.42: gre 2001:608:4:4444::2 > 2001:608:4:4444::1: icmp6: echo reply (len 60, hlim 64) (ttl 30, id 520, len 124)
23:19:27.386273 195.30.70.42 > 193.149.48.168: gre 2001:608:4:4444::1 > 2001:608:4:4444::2: icmp6: echo request (len 60, hlim 64) (ttl 255, id 826, len 124)
23:19:27.678095 193.149.48.168 > 195.30.70.42: gre 2001:608:4:4444::2 > 2001:608:4:4444::1: icmp6: echo reply (len 60, hlim 64) (ttl 30, id 521, len 124)
195.30.70.42 is a Cisco router, sending IPv6 pings to 193.149.48.168,
which is the NetBSD machine in question.
The echo request comes in at 23:19:26.267, while the echo reply doesn't
leave before 23:19:27.376 - over a second later.
For the second echo request/reply, the delay is only 300 ms, but still
way too high.
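
For anyone who wants to check these numbers against a longer capture,
the following is a small sketch that turns two tcpdump timestamps into
a delay in microseconds. It assumes the "hh:mm:ss.uuuuuu" format shown
above and that both timestamps fall on the same day; the function names
are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Convert a tcpdump-style "hh:mm:ss.uuuuuu" timestamp to microseconds
 * since midnight. Returns -1 if the string does not parse.
 * Assumes the fractional part always has exactly six digits.
 */
static long long
demo_ts_to_usec(const char *ts)
{
	int h, m, s, us;

	assert(ts != NULL);
	if (sscanf(ts, "%d:%d:%d.%6d", &h, &m, &s, &us) != 4)
		return -1;
	return ((h * 60LL + m) * 60LL + s) * 1000000LL + us;
}

/* Delay between request and reply in microseconds (same day assumed). */
static long long
demo_delay_usec(const char *request_ts, const char *reply_ts)
{
	return demo_ts_to_usec(reply_ts) - demo_ts_to_usec(request_ts);
}
```

For the two request/reply pairs in the capture above this gives
1109019 and 291822 microseconds respectively.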
Testing with asymmetric tunneling and with IPv4-over-GRE confirms that it's
definitely the receiving path for IPv6-over-GRE (if the tunnel is only
sending, no "weird delays" occur).
The problem *is* related to "schednetisr()" processing in some way.
If I run a "ping6 -i 0.1 $someotherhost" on the LAN, to make sure the IPv6
input queue is permanently serviced, the tcpdump looks different:
23:24:21.600740 195.30.70.42 > 193.149.48.168: gre 2001:608:4:4444::1 > 2001:608:4:4444::2: icmp6: echo request (len 60, hlim 64) (ttl 255, id 846, len 124)
23:24:21.690332 193.149.48.168 > 195.30.70.42: gre 2001:608:4:4444::2 > 2001:608:4:4444::1: icmp6: echo reply (len 60, hlim 64) (ttl 30, id 707, len 124)
23:24:21.700073 195.30.70.42 > 193.149.48.168: gre 2001:608:4:4444::1 > 2001:608:4:4444::2: icmp6: echo request (len 60, hlim 64) (ttl 255, id 847, len 124)
23:24:21.773719 193.149.48.168 > 195.30.70.42: gre 2001:608:4:4444::2 > 2001:608:4:4444::1: icmp6: echo reply (len 60, hlim 64) (ttl 30, id 709, len 124)
- same two machines, identical tunnel configuration, but delay is down
to 10-90 ms. Running "ping -i 0.02 $otherhost" in parallel reduces the
delay further (as is to be expected).
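
The observation is consistent with a simple model of how queued packets
get serviced, sketched below as a userspace toy. This is only an
assumption about the general mechanism (a queue per protocol plus a
"pending" bit that schednetisr() would set), not the actual NetBSD
implementation, and all demo_* names are hypothetical.

```c
#include <assert.h>

/* Hypothetical protocol indices; not the kernel's NETISR_* values. */
enum { DEMO_ISR_IP = 0, DEMO_ISR_IPV6 = 1, DEMO_ISR_MAX = 2 };

struct demo_net {
	unsigned pending;             /* one "please run me" bit per protocol */
	int queued[DEMO_ISR_MAX];     /* packets waiting in each input queue */
	int delivered[DEMO_ISR_MAX];  /* packets handed to each protocol */
};

/* Model of IF_ENQUEUE(): the packet is queued, but nobody is notified. */
static void
demo_enqueue(struct demo_net *n, int isr)
{
	assert(isr >= 0 && isr < DEMO_ISR_MAX);
	n->queued[isr]++;
}

/* Model of schednetisr(): mark the protocol's queue as needing service. */
static void
demo_schednetisr(struct demo_net *n, int isr)
{
	assert(isr >= 0 && isr < DEMO_ISR_MAX);
	n->pending |= 1u << isr;
}

/* Model of the soft interrupt: drain only queues whose bit is set. */
static void
demo_softnet(struct demo_net *n)
{
	for (int isr = 0; isr < DEMO_ISR_MAX; isr++) {
		if ((n->pending & (1u << isr)) == 0)
			continue;
		n->pending &= ~(1u << isr);
		n->delivered[isr] += n->queued[isr];
		n->queued[isr] = 0;
	}
}
```

In this model a packet that is enqueued without a matching
demo_schednetisr() call stays in its queue until something else (here:
unrelated IPv6 traffic) sets the bit, at which point everything waiting
in that queue is delivered at once.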
Now I need your help :-) - which part of the code shall I poke, to find
out where these reception delays happen?
System environment: NetBSD/Sparc64, Sun Ultra 5, -current as of yesterday
(2005/01/24).
gert
----------- snip ----------
Index: net/if_gre.c
===================================================================
RCS file: /cvsroot/src/sys/net/if_gre.c,v
retrieving revision 1.54
diff -u -r1.54 if_gre.c
--- net/if_gre.c 6 Dec 2004 02:59:23 -0000 1.54
+++ net/if_gre.c 25 Jan 2005 22:28:26 -0000
@@ -7,6 +7,8 @@
* This code is derived from software contributed to The NetBSD Foundation
* by Heiko W.Rupp <hwr@pilhuhn.de>
*
+ * IPv6-over-GRE contributed by Gert Doering <gert@greenie.muc.de>
+ *
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
@@ -180,6 +182,7 @@
struct gre_softc *sc = ifp->if_softc;
struct greip *gh;
struct ip *ip;
+ u_int8_t ip_tos = 0;
u_int16_t etype = 0;
struct mobile_h mob_h;
@@ -263,9 +266,14 @@
goto end;
}
} else if (sc->g_proto == IPPROTO_GRE) {
+#ifdef GRE_DEBUG
+ printf( "start gre_output/GRE, dst->sa_family=%d\n",
+ dst->sa_family );
+#endif
switch (dst->sa_family) {
case AF_INET:
ip = mtod(m, struct ip *);
+ ip_tos = ip->ip_tos;
etype = ETHERTYPE_IP;
break;
#ifdef NETATALK
@@ -278,6 +286,11 @@
etype = ETHERTYPE_NS;
break;
#endif
+#ifdef INET6
+ case AF_INET6:
+ etype = ETHERTYPE_IPV6;
+ break;
+#endif
default:
IF_DROP(&ifp->if_snd);
m_freem(m);
@@ -312,7 +325,7 @@
gh->gi_dst = sc->g_dst;
((struct ip*)gh)->ip_hl = (sizeof(struct ip)) >> 2;
((struct ip*)gh)->ip_ttl = ip_gre_ttl;
- ((struct ip*)gh)->ip_tos = ip->ip_tos;
+ ((struct ip*)gh)->ip_tos = ip_tos;
gh->gi_len = htons(m->m_pkthdr.len);
}
@@ -381,6 +394,10 @@
case AF_INET:
break;
#endif
+#ifdef INET6
+ case AF_INET6:
+ break;
+#endif
default:
error = EAFNOSUPPORT;
break;
Index: netinet/ip_gre.c
===================================================================
RCS file: /cvsroot/src/sys/netinet/ip_gre.c,v
retrieving revision 1.30
diff -u -r1.30 ip_gre.c
--- netinet/ip_gre.c 26 Apr 2004 01:31:56 -0000 1.30
+++ netinet/ip_gre.c 25 Jan 2005 22:28:27 -0000
@@ -7,6 +7,8 @@
* This code is derived from software contributed to The NetBSD Foundation
* by Heiko W.Rupp <hwr@pilhuhn.de>
*
+ * IPv6-over-GRE contributed by Gert Doering <gert@greenie.muc.de>
+ *
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
@@ -145,7 +147,7 @@
gre_input2(struct mbuf *m, int hlen, u_char proto)
{
struct greip *gip;
- int s;
+ int s, isr;
struct ifqueue *ifq;
struct gre_softc *sc;
u_int16_t flags;
@@ -186,22 +188,31 @@
switch (ntohs(gip->gi_ptype)) { /* ethertypes */
case ETHERTYPE_IP: /* shouldn't need a schednetisr(), as */
ifq = &ipintrq; /* we are in ip_input */
+ isr = NETISR_IP;
break;
#ifdef NS
case ETHERTYPE_NS:
ifq = &nsintrq;
- schednetisr(NETISR_NS);
+ isr = NETISR_NS;
break;
#endif
#ifdef NETATALK
case ETHERTYPE_ATALK:
ifq = &atintrq1;
- schednetisr(NETISR_ATALK);
+ isr = NETISR_ATALK;
break;
#endif
+#ifdef INET6
case ETHERTYPE_IPV6:
- /* FALLTHROUGH */
+#ifdef GRE_DEBUG
+ printf( "ip_gre.c/gre_input2: IPv6 packet\n" );
+#endif
+ ifq = &ip6intrq;
+ isr = NETISR_IPV6;
+ break;
+#endif
default: /* others not yet supported */
+ printf( "ip_gre.c/gre_input2: unhandled ethertype 0x%04x\n", (int) ntohs(gip->gi_ptype) );
return (0);
}
break;
@@ -239,6 +250,8 @@
} else {
IF_ENQUEUE(ifq, m);
}
+ /* we need schednetisr since the address family may change */
+ schednetisr(isr);
splx(s);
return (1); /* packet is done, no further processing needed */
--
USENET is *not* the non-clickable part of WWW!
//www.muc.de/~gert/
Gert Doering - Munich, Germany gert@greenie.muc.de
fax: +49-89-35655025 gert@net.informatik.tu-muenchen.de