Subject: Re: IP-in-TCP?
To: der Mouse <mouse@Rodents.Montreal.QC.CA>
From: Daniel Carosone <>
List: tech-net
Date: 02/02/2005 16:41:57

On Wed, Feb 02, 2005 at 12:02:23AM -0500, der Mouse wrote:
> I find myself needing to encapsulate IP in TCP, for reasons not really
> worth going into here (the principal part is a NAT box whose
> configuration is out of my control).

Does it have to be TCP? Could UDP work as well?

> I seem to recall seeing, in the past, various statements that doing
> this leads to bad interactions between timing algorithms at various
> levels.

Timing is a small part of it, but it's basically a loss issue (and a
loss, all up).

Posit a TCP connection between a and d that traverses a tunnel
between b and c in the middle.  An outer TCP packet b->c gets dropped
somewhere in the middle; it was also carrying some part of the TCP data
between a and d.  Both TCPs will try to do error recovery and
retransmit, and - most significantly - none of a's recovery will work
until b's does, and then the whole lot comes through.  Now, what
happens when more than one TCP session is being tunneled?
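
As a rough illustration of that last question, here is a toy model (not
a measurement) of an in-order tunnel carrying two inner sessions.  The
timings, the session labels, and the 1 s outer retransmission timeout
are all made-up assumptions; the point is only that one lost outer
segment holds back every packet queued behind it, whichever session it
belongs to.

```python
def delivery_times(arrivals):
    """In-order delivery: a packet is released only once every
    earlier packet in the stream has arrived."""
    released = []
    latest = 0.0
    for t in arrivals:
        latest = max(latest, t)
        released.append(latest)
    return released

# Six tunnelled packets sent 10 ms apart with a 50 ms one-way delay.
# They alternate between two inner sessions (hypothetical labels).
sessions = ["s1", "s2", "s1", "s2", "s1", "s2"]
one_way = 0.050
outer_rto = 1.0  # assumed outer retransmission timeout, in seconds

arrivals = [i * 0.010 + one_way for i in range(len(sessions))]
lost = 1
arrivals[lost] += outer_rto  # this outer segment is dropped once, then resent

# Reliable, in-order tunnel: everything after the loss waits for it.
tcp_tunnel = delivery_times(arrivals)

# Datagram tunnel: the lost packet is simply missing, the rest are on time.
udp_tunnel = [t for i, t in enumerate(arrivals) if i != lost]

for i, name in enumerate(sessions):
    print(name, "packet", i, "released at %.3f s" % tcp_tunnel[i])
```

In the in-order case every packet after the lost one, from both
sessions, is released only when the retransmission arrives; in the
datagram case only the one session that owned the lost packet has
anything to recover.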

> What I'm wondering is, can someone describe how best to
> minimize such effects?  To pick one simple example, should TCP_NODELAY
> be turned on or off on the tunnel connection?

About the best you can do is slow the TCP retransmission timers on a/d
right down, so that they attempt error recovery only very slowly and
give b/c time to recover first.  This can help, but only somewhat, and
then loss between a-b or c-d causes long timeouts instead.

The very nature of the TCP tunnel as being reliable and in-order is
the problem, for tunneled protocols which assume unreliable and
unordered transport.

If you can do UDP, there's a whole range of IP tunnelling options:
L2TP, IPsec NAT-T, and others, each with varied suitability and
caveats, none of which attempt to provide more than the underlying
network in terms of reliability and ordering.  They can have
interactions with TCP too, but they're mostly of the PMTU kind.  (NAT-T
can run over TCP too, with the limitations above.)

If it really has to be TCP bridging the problem gap, consider linking
TCP connections rather than nesting them: something like SSH port
forwarding, or an HTTP or SOCKS proxy.  That will get you much better
results, provided you don't really need end-to-end routing.
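
To make "linking" concrete, here is a minimal sketch of what such a
relay does at its core: it terminates one TCP connection and copies the
byte stream onto a second, independent one, so each leg does its own
loss recovery and nothing is nested.  The names pump and link are made
up for this example; real proxies (SSH, SOCKS) add authentication,
addressing and error handling on top.

```python
import socket
import threading

def pump(src, dst, bufsize=65536):
    """Copy bytes from src to dst until src reaches end-of-stream."""
    while True:
        data = src.recv(bufsize)
        if not data:
            break
        dst.sendall(data)
    try:
        # Pass the end-of-stream on to the other leg.
        dst.shutdown(socket.SHUT_WR)
    except OSError:
        pass  # the other leg is already gone

def link(client, upstream):
    """Join two established stream sockets back to back,
    one thread per direction (hypothetical helper)."""
    threads = [
        threading.Thread(target=pump, args=(client, upstream), daemon=True),
        threading.Thread(target=pump, args=(upstream, client), daemon=True),
    ]
    for t in threads:
        t.start()
    return threads
```

Only the payload bytes cross the relay, never the inner IP packets,
which is exactly why end-to-end routing is lost.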

Failing all that, my suggestion would be to dissolve the problem by
recreating the underlying network behaviour.  Set up tun(4) or similar
devices at each end, and write a tunnelling daemon that opens *lots*
of TCP streams between b-c, spreads all the tunnelled packets evenly
across them, and doesn't try to do any more recovery than opening a
new stream when one closes unexpectedly.
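
A minimal sketch of the sending half of such a daemon, under stated
assumptions: packets are framed with a 2-byte length prefix, spread
round-robin, and a stream that fails is simply replaced, with the
packet in hand dropped rather than resent.  The names StreamSpreader,
frame, read_frame and the reconnect callback are invented for
illustration, and reading packets from the tun(4) device is left out.

```python
import itertools
import socket
import struct

def frame(packet: bytes) -> bytes:
    """Prefix a packet with its length (2 bytes, network byte order)."""
    return struct.pack("!H", len(packet)) + packet

def _read_exact(sock, n):
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("stream closed")
        buf += chunk
    return buf

def read_frame(sock) -> bytes:
    """Read one length-prefixed packet back off a stream."""
    (length,) = struct.unpack("!H", _read_exact(sock, 2))
    return _read_exact(sock, length)

class StreamSpreader:
    """Spread tunnelled packets evenly over several TCP streams."""

    def __init__(self, streams, reconnect):
        self.streams = list(streams)
        self.reconnect = reconnect  # hypothetical: returns a fresh stream
        self._next = itertools.cycle(range(len(self.streams)))

    def send(self, packet: bytes) -> None:
        i = next(self._next)
        try:
            self.streams[i].sendall(frame(packet))
        except OSError:
            # No recovery beyond opening a new stream: the packet is
            # dropped, just as the underlying network might drop it.
            self.streams[i].close()
            self.streams[i] = self.reconnect()
```

The receiving end would call read_frame on each stream and write the
result to its own tun device; ordering across streams is deliberately
not preserved.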

Such a system will end up duplicating 'lost' packets most of the time,
but as long as you have more open tunnel streams than real ones (or
more tunnel streams than packets lost, scaled by some factor), the
inner TCPs shouldn't get stuck behind outer TCP recovery.

