Subject: Re: Truly bizarre problem with GRE tunnel.
To: None <current-users@netbsd.org>
From: Lars-Johan Liman <liman@autonomica.se>
List: current-users
Date: 07/03/2006 12:10:30
I'm responding to my own message from 2005-12-03 here. I didn't get
very far last time, although Christos did his best and deserves my
warmest appreciation.

"For external reasons" this issue resurfaced on my stack. I did some
more debugging today, and came to the following somewhat odd
conclusion:

If I don't have packet filtering turned on, and set up a routing entry
in the kernel that routes certain packets into the GRE tunnel, that
works fine, *BUT* if I turn on the packet forwarding engine in ipf
using "express forwarding", then it breaks. The way it breaks is that
the packet length field and the header checksum in the _INNER_ packet
are byte swapped.

In the attached tcpdump file I use 192.71.80.160/28 on the inside, and
I ping from host 192.71.80.163. In the "router" I route 192.71.80.70
explicitly to 192.168.101.2 (address inside remote end of tunnel). The
first two pings succeed.

Then I add the following to my /etc/ipf.conf:

  pass in on fxp0 to gre0 from 192.71.80.160/28 to any

(note the "to gre0" which is the express forwarding).

The two following pings never arrive at the destination, because the
remote tunnel endpoint discards them, due to the issues above.
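
To illustrate why a receiver would throw such packets away, here is a
minimal sketch in Python. The header values (TTL 64, ID 0, an 84-byte
ICMP packet) and the helper names are my own assumptions for
illustration, not taken from the capture. It builds a valid IPv4
header, byte swaps the length and checksum fields, and applies the two
sanity checks an IP input routine typically makes:

```python
import struct

def ip_checksum(header: bytes) -> int:
    """Ones'-complement sum of the 16-bit words, then complemented."""
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_header(total_len: int) -> bytearray:
    # Hypothetical 20-byte IPv4 header: ICMP, 192.71.80.163 -> 192.71.80.70.
    hdr = bytearray(struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, total_len, 0, 0, 64, 1, 0,
        bytes([192, 71, 80, 163]), bytes([192, 71, 80, 70])))
    struct.pack_into("!H", hdr, 10, ip_checksum(bytes(hdr)))
    return hdr

def swap_field(hdr: bytearray, offset: int) -> None:
    """Byte swap one 16-bit field in place."""
    hdr[offset], hdr[offset + 1] = hdr[offset + 1], hdr[offset]

def header_ok(hdr: bytearray, received_len: int) -> bool:
    # A checksum over a valid header (checksum field included) is zero,
    # and the claimed length must fit in what was actually received.
    claimed = struct.unpack_from("!H", hdr, 2)[0]
    return ip_checksum(bytes(hdr)) == 0 and claimed <= received_len

good = build_header(84)
bad = bytearray(good)
swap_field(bad, 2)    # total length field
swap_field(bad, 10)   # header checksum field
print(header_ok(good, 84), header_ok(bad, 84))
```

With these example values the untouched header passes both checks and
the swapped one fails both.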

I would call this a bug, and I would appreciate it if someone with IP
stack clue/interface driver clue/ipf clue could spend a cycle or two
to figure out why there is a difference.

My wild guess (and it is exactly that) is that the incoming packet
header is never converted from network byte order to host byte order,
and that works fine as long as you just take the packet from one
interface and dump it on another, but in the GRE interface the
outgoing _INNER_ packet _is_ reordered from host to network in _all_
cases, which swaps the bytes of these particular fields.
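
To make the guess concrete, here is a small Python model of it. This is
only a sketch of the hypothesis, not of the actual kernel code paths:
the function names are made up, and htons()/ntohs() are modelled as a
plain in-place byte swap, which is what they amount to on a
little-endian host such as i386.

```python
import struct

IP_LEN_OFFSET = 2  # offset of the total length field in an IPv4 header

def swap16_in_place(pkt: bytearray, offset: int) -> None:
    """Model of htons()/ntohs() on a little-endian host."""
    pkt[offset], pkt[offset + 1] = pkt[offset + 1], pkt[offset]

def wire_length(pkt: bytes) -> int:
    """Read the length field the way the remote end will (network order)."""
    return struct.unpack_from("!H", pkt, IP_LEN_OFFSET)[0]

def normal_path(pkt: bytes) -> bytearray:
    out = bytearray(pkt)
    swap16_in_place(out, IP_LEN_OFFSET)  # input: network -> host
    swap16_in_place(out, IP_LEN_OFFSET)  # gre output: host -> network
    return out

def fast_path(pkt: bytes) -> bytearray:
    out = bytearray(pkt)
    # Hypothesis: the input conversion is skipped here ...
    swap16_in_place(out, IP_LEN_OFFSET)  # ... but gre output still converts
    return out

# First four bytes of an IPv4 header with total length 0x0054.
header = bytes([0x45, 0x00, 0x00, 0x54])
print(hex(wire_length(normal_path(header))))  # 0x54
print(hex(wire_length(fast_path(header))))    # 0x5400
```

Two conversions cancel out; a single unmatched one leaves the field
byte swapped on the wire, which is the symptom I see.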

The platform where this occurred is NetBSD i386 from
ftp.netbsd.org:/.../NetBSD-daily/HEAD/200606080000Z, but the problem
did not start recently.

All hints appreciated.

(This has been send-pr:ed, but is waiting in grey-list quarantine for
the moment.)

				Cheers,
				  /Liman
#----------------------------------------------------------------------
# There are 10 kinds of people in the world. Those who understand
# binary numbers, and those who don't.
#----------------------------------------------------------------------
# Lars-Johan Liman, M.Sc.	! E-mail: liman@autonomica.se
# Senior Systems Specialist     ! HTTP  : //www.autonomica.se/
# Autonomica AB, Stockholm 	! Voice : +46 8 - 615 85 72
#----------------------------------------------------------------------

liman@autonomica.se:
> Some time ago I used to have a GRE tunnel from home to my
> server. Worked like a charm (for the limited value of "charm" that
> applies to tunnels ...).

> Tunnel not used much. Time passed.

> Recently upgraded home to 3.99.11. Server is still at 1.6ZK. Tried to
> re-establish tunnel. Failure.

> After _MUCHO_ debugging (Ethereal Is Your Friend(TM)), I have now
> concluded that:

> At home, on the _OUTGOING_ side, the encapsulated packets are
> fine. (tcpdump on physical interface (tlp0), not tunnel interface
> (gre0).)

> At server, on the _INCOMING_ side, the same encapsulated packets
> arrive with the "IP length" header field of the _ENCAPSULATED_
> (inner) packet byte swapped. That, and ONLY that, is byte swapped.
> (e.g., 0x0054 becomes 0x5400).

> 21:05:29.728223 82.182.146.229 > 192.71.228.16: gre truncated-ip - 21420 bytes missing! 192.71.228.166 > 192.71.80.70: icmp: echo request seq 288 [tos 0x30] 
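
The "21420 bytes missing" figure in that line is consistent with a byte
swapped length field. A quick check in Python, assuming a default ping
with 56 data bytes (my assumption, though it matches the 0x0054 example
above) and assuming tcpdump reports the claimed IP length minus the
bytes actually present:

```python
# Assumed sizes: IPv4 header without options, ICMP echo header,
# and the default ping payload.
IP_HEADER, ICMP_HEADER, PING_DATA = 20, 8, 56

real_len = IP_HEADER + ICMP_HEADER + PING_DATA

# Write the length in network order, read it back the wrong way round.
swapped_len = int.from_bytes(real_len.to_bytes(2, "big"), "little")

missing = swapped_len - real_len
print(hex(real_len), hex(swapped_len), missing)  # 0x54 0x5400 21420
```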

> Some diff-serv params of the container (outer) packet are also
> changed, but that's less disturbing.

> What in heaven's name is going on?

> Is there _ANY_ chance that this pertains to NetBSD? ("Nooooo!" is my
> answer.)

> Tell me that this _HAS_ to be my ISP(s) playing tricks on me. My
> current guess is a bug in some intermediate system, that actually
> tries to de-compile my GRE stuff and poke around inside it. (And if
> so, I have very clear opinions about messing _inside_ my packets ...)

> Anyone else seen this?

> 				Cheers,
> 				  /Liman