Subject: Re: IPv6 Comment
To: None <phil@merlin.cs.wwu.edu, smd@ebone.net>
From: Sean Doran <smd@ebone.net>
List: current-users
Date: 09/01/2000 22:40:53

Hm, where do we discuss things like this that actually have to
do with implementing things?  Even ignoring my (hi, B!) "pro-NAT, anti-IPv6
jihad" politics, some of these issues might be interesting to someone
who philosophically wants things to work, warts-in-the-network and all.

I also accept "please just shut up" and silence with a smile. ;>

Most of this will revolve around two problems:

	1/ distinguishing "what" from "where" when they are different
and	2/ coping with the fact that "where" can change over time

In the past, IP addresses were always both "what" and "where": the 32-bit
value identified a particular machine and also its topological location.

Now a machine can have its addresses change without it becoming a different
machine, and multiple machines can share the same topological locator.

| Consider rpc2 (the underlying protocol for Coda):
|
|    a) it is UDP based (which means NAT mappings are temporary at best
| 	and could easily send a reply packet to the wrong machine with
| 	several hosts sending to the same ip/port.)

This is an interesting implementation issue.

NAT should be a deterministic mapping in the spatial dimension,
with only long-term changes to that mapping.

It is possible, and sometimes desirable, to have fewer outside
addresses than things inside, or to overload individual addresses,
thus doing a temporal mapping.  Things that make assumptions about
the lifetime of the address of the "other side" can break, yes.

I would argue that there is probably some widely-agreeable time
frame during which any particular address gotten from the DNS is
valid; I would hope that the DNS ttl on the A RR would be an
indicator of the validity, but this is possibly fantasy.  When
overloading NAT mappings, one also runs the risk of state loss
(or just not caring) that causes an address which another system
still believes to be valid to be reused.
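
A minimal sketch of treating the A RR ttl as a validity window, in
Python.  Everything here is an assumption for illustration: the
PeerAddress class, the resolve() callback returning (address, ttl in
seconds), and the injectable clock are hypothetical, and none of it
is rpc2 or Coda code.

```python
import time


class PeerAddress:
    """Remember a peer's address only for as long as its DNS ttl allows."""

    def __init__(self, name, resolve, clock=time.monotonic):
        self.name = name
        self._resolve = resolve   # hypothetical: name -> (address, ttl_seconds)
        self._clock = clock
        self._address = None
        self._expires = 0.0

    def current(self):
        """Return the cached address, re-querying once the ttl has lapsed."""
        now = self._clock()
        if self._address is None or now >= self._expires:
            self._address, ttl = self._resolve(self.name)
            self._expires = now + ttl
        return self._address

    def invalidate(self):
        """Forget the address, e.g. after the peer stops answering."""
        self._address = None
```

Calling invalidate() when packets go unanswered covers the case where
the "where" changes before the ttl says it should.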

This is likewise a problem without any NAT at all.  If one is
assigned addresses dynamically, and other parties don't know
that a change has happened, they cannot know that they
should re-query the DNS.

Renumbering easily is one of the touted benefits of IPv6, so perhaps
implementations of rpc2 should take into account not just the spatial
change of address (the receiver might not see the sender as having
the same IP address the sender has), but also the possibility that
the address may change over time.

I for one wish protocols could deal with me getting a new
dynamically-assigned address after a disconnection, but IP in general
(v4 and v6) does very badly at distinguishing "what" from "where".

|    b) it has a "side effect" transport where a UDP packet is sent back
| 	on a different port number than the original packet and thus
| 	the NAT box would have no record of a UDP packet sent on that
| 	port and would have no clue as to where to forward the packet.

Eh?  Most NAT implementations have a deterministic mapping if
there is no temporal address overloading.  It's usually done on
an (inside-host-address,outside-host-address) 2-tuple, translating
only addresses.  NATs where there is overloading, and PATs, can be
very different.  Of course if your mapping is not deterministic,
you can indeed lose this way.
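
To make the difference concrete, here is a rough Python sketch of
the two cases.  The class names, the addresses and the port numbers
are all made up for illustration; real NAT and PAT boxes keep more
state than this (timeouts, the remote endpoint, and so on).

```python
class StaticNat:
    """One outside address per inside host; only addresses are rewritten."""

    def __init__(self, inside_to_outside):
        self._out = dict(inside_to_outside)
        self._in = {o: i for i, o in self._out.items()}

    def outbound(self, src_addr, src_port):
        return self._out[src_addr], src_port

    def inbound(self, dst_addr, dst_port):
        # The outside address alone names the inside host, so a packet
        # to a port the host never sent from is still deliverable.
        inside = self._in.get(dst_addr)
        return None if inside is None else (inside, dst_port)


class Pat:
    """Many inside hosts behind one outside address; ports are rewritten."""

    def __init__(self, outside_addr, first_port=40000):
        self._addr = outside_addr
        self._next = first_port
        self._by_inside = {}   # (inside addr, inside port) -> outside port
        self._by_port = {}     # outside port -> (inside addr, inside port)

    def outbound(self, src_addr, src_port):
        key = (src_addr, src_port)
        if key not in self._by_inside:
            self._by_inside[key] = self._next
            self._by_port[self._next] = key
            self._next += 1
        return self._addr, self._by_inside[key]

    def inbound(self, dst_addr, dst_port):
        # Only ports created by earlier outbound traffic are known.
        if dst_addr != self._addr:
            return None
        return self._by_port.get(dst_port)
```

With StaticNat, an inbound packet to any port of the outside address
reaches the one inside host it stands for.  With Pat, an inbound
packet to a port no inside host has sent from returns None: the box
has no clue where to forward it.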

I don't know how to help in that situation, other than hoping
that if something isn't working, the DNS can be consulted again,
and maybe there's a new address, or you have some other way of
notifying other hosts that the "what" to "where" mapping has changed.

| 	One could tell the NAT box to always send the UDP packets on
| 	this one port to a particular machine behind the NAT box, but
| 	then all other machines lose.

Mmm, PAT.  This is tricky.  You end up with multiple "whats" with
the same "where".

This is a compression problem; with NAT you can choose to compress
away only the unused address space, by consuming an address per
"inside" host.  You can do this deterministically or not.
You can get greater but lossier compression by eliminating
hosts which have been quiet for a time.   You can get still
greater compression by putting multiple non-quiet hosts behind
single addresses, using port-substitution (PAT).
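
The middle option (dropping hosts that have been quiet for a time)
might look something like the Python sketch below.  The
DynamicNatPool class, its idle_timeout parameter and the explicit
"now" argument are assumptions made for illustration, not a
description of any particular NAT product.

```python
class DynamicNatPool:
    """Lend outside addresses to inside hosts; reclaim them when idle."""

    def __init__(self, outside_addrs, idle_timeout):
        self._free = list(outside_addrs)
        self._idle_timeout = idle_timeout
        self._lease = {}   # inside addr -> (outside addr, last time seen)

    def outbound(self, inside, now):
        """Return the outside address for this host, or None if none is free."""
        self._expire(now)
        if inside in self._lease:
            outside, _ = self._lease[inside]
        else:
            if not self._free:
                return None   # pool exhausted: this host gets nothing
            outside = self._free.pop(0)
        self._lease[inside] = (outside, now)
        return outside

    def _expire(self, now):
        # Reclaim the address of any host quiet for idle_timeout or longer.
        for inside, (outside, last) in list(self._lease.items()):
            if now - last >= self._idle_timeout:
                del self._lease[inside]
                self._free.append(outside)
```

The lossy part is visible with a pool of one address: once the first
host goes quiet long enough, the second host gets the same outside
address, and anyone outside who still believes that address means the
first host is now wrong.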

The decision about the compression-to-safety ratio is a local one,
not an architectural one, just like the decision about whether
to compress or not (i.e., use a NAT or not).

I guess as a protocol designer and implementer, whether you
want to take NAT, PAT and friends into account is also a local
issue.   "Just NAT" is pretty easy.  PAT does make things harder.

| 	For me in my home, that translates to ONE coda capable machine
| 	behind the NAT box.  Not very nice.

Yeah, if you have a PAT rule that compresses everything into one address.

Fixing that protocol-wise is challenging, because you cannot initiate
a conversation from the "outside" without being psychic or going out of
band.

On the other hand, an "inside" host can get to you, and you
MAY be able to guess that using the source port plus DNS name will
result in a two-way connection.  It's a pity that you only get to
guess, rather than ask something like the DNS, though.
("coda.server2.inside-a-single-address.com" -> "... IN P 1221")

Personally, I also think it's fair to be sneaky, and try to negotiate
with the single exposed CODA talker that it act as an ALG or proxy for
other CODA-talkers "inside" the PAT.   Likewise, using another mechanism
(e.g. email!) to ask the "inside" host to start the conversation could work.
PAT is not a security technique, and is in no way immune to "covert
channels"; if it can be worked around, that's a feature.

| Note, this has nothing to do with IP numbers in the data stream!  So
| this can't be a DNS vs IP issue.  The "remote" rpc2 responds to the
| NAT output IP.  

Eh, why doesn't it respond to an embedded DNS name?

| The problem is that with a single ip/port number, 
| it can talk to one machine for side effects.

Right, several machines can have the same address, because of PAT compression.

So how do you know what machine you're talking to, if they have the same where?

| And to my knowledge, 
| there is not enough information for an ALG either.  

Well, some things break when others are deployed. :(

Can a protocol other than rpc2 be substituted, or is there
no real way to swap one data-carrying protocol for another?

That'd be too bad, since CODA is kinda neat.

	Sean.