Subject: Re: default route and private networks
To: Bill Studenmund <wrstuden@netbsd.org>
From: Jonathan Stone <jonathan@dsg.stanford.edu>
List: tech-net
Date: 04/26/2005 12:47:27
Bill, Jason, Steve, whoever:

I have to drop out of this discussion for a couple of days.  I wrote
this last night but didn't send it. I _think_ it makes sense.



In message <20050425225851.GD20220@netbsd.org>,
Bill Studenmund writes:


>> I am skeptical this is technically feasible, at least if we agree to
>> consider both `clients' and `servers'. I see no good way to specify
>> all desired policies in the way you propose, short of a firewall-style
>> filter language: write your policies in the language, then insert
>> them into the kernel.
>
>Ahh... now we're wandering into implementation questions.

Heck, we were into implementation questions from the get-go.



>Note: I'm about to expand on ideas from the note I just sent, so please
>forgive me if I repeat things you haven't had a chance to respond to. :-)

Fair enough.  Same from my end, too. Though if I don't get to them in
this message, it may take a day or two.


>> For my purposes, which are server-oriented, that is overkill.  The
>> performance overhead for interpreting such rules, for every packet
>> sent and received, is not acceptable for the kinds of environments I
>> care about -- multiple 1 gig NICs at saturation, or 10GbE NICs.  I'd
>> like to talk about that separately; I will reply separately (a third
>> time, sorry!) about that.
>
>Ok, here are some implementation thoughts. First, this question should
>ONLY be coming up when we don't already have a source address. For TCP,
>once we have connected, we should always be using our IP for the socket.

I would say that very differently: for active-open with unbound TCP
sockets, we only get to choose a local IP address once; thereafter
the local IP address is fixed by the 4-tuple.

For passive opens (that aren't rejected), the local IP address is
specified by the active opener.

I can't tell if that matches what you say; are you maybe thinking
solely of the "client" side, again?
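
A minimal sketch of the active-open and passive-open cases described
above, using the loopback interface as a stand-in for a real network.
The use of 127.0.0.1 and an ephemeral port is an assumption made only
so the example runs on a single host.

```python
import socket

# Listener on loopback stands in for the "server" side (assumption:
# 127.0.0.1 is configured; port 0 asks the kernel for any free port).
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
server_addr = listener.getsockname()

# Active open on an unbound socket: the kernel chooses the local
# address exactly once, at connect() time.
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(server_addr)
chosen_local = client.getsockname()   # now part of the 4-tuple

# Passive open: the accepted socket's local address is the one the
# active opener addressed its SYN to.
accepted, peer = listener.accept()
accepted_local = accepted.getsockname()

for s in (accepted, client, listener):
    s.close()
```

After connect(), chosen_local does not change for the life of the
connection, and accepted_local equals the address the client targeted.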

>Likewise, if a UDP send has given an address or any other IP packet
>generation method (protocol, direct creation, etc.) has specified the
>source address, this code should not change it.

Mais d'accord. Emphatic yes. If the local address has been bound
(either explicitly, or by necessity, as with TCP), it's bound.


>Other code may forbid it, apply other policy, or whatever (like you can't
>send from low-numbered ports if you aren't root), but we already have that
>and I'm not suggesting changing it.
>
>So I doubt that this code, if done even moderately right, would impact the
>environments you describe above; we aren't sending TCP connect packets at
>line rate (well, not in the normal case :-) .

Again, I am thinking multihomed servers. I think you agreed, a few
messages back, that servers could conceivably get more complicated.



>So how about an idea someone else suggested earlier in the thread, that we
>add a field to the routes that indicates a source address to use if no
>other source is specified?

But we need a route that says: send out interface X; but when
selecting a local address for a not-otherwise-bound socket, use a
local address, Y, that's on some other interface.
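
A rough sketch of what such a route entry might look like. Everything
here is hypothetical: the Route fields, the src_hint name, the
interface names, and the addresses are illustrative only and do not
describe any existing kernel structure.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional

@dataclass
class Route:
    dest: str                       # destination prefix
    out_if: str                     # interface X to transmit on
    src_hint: Optional[str] = None  # hypothetical: preferred local address Y

# Hypothetical host with one address per interface.
IF_ADDRS = {"if0": ["10.0.0.5"], "if1": ["192.0.2.7"]}

ROUTES = [
    # Send out if0, but source unbound sockets from an address on if1.
    Route("0.0.0.0/0", "if0", src_hint="192.0.2.7"),
    Route("10.0.0.0/8", "if0"),
]

def lookup(dst: str) -> Route:
    """Longest-prefix match over the toy route list."""
    matches = [r for r in ROUTES if ip_address(dst) in ip_network(r.dest)]
    return max(matches, key=lambda r: ip_network(r.dest).prefixlen)

def pick_source(dst: str, bound_src: Optional[str] = None) -> str:
    """Choose a source only when the socket has not already bound one."""
    if bound_src is not None:
        return bound_src            # an already-bound address is never changed
    route = lookup(dst)
    if route.src_hint is not None:
        return route.src_hint       # per-route hint, possibly on another interface
    return IF_ADDRS[route.out_if][0]  # fall back to an address on the output interface
```

Here pick_source("198.51.100.1") yields 192.0.2.7 even though the
packet leaves via if0, which is exactly the case the route would have
to express.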

Hmm. I can construct a scenario that runs into the same problem I
asked about a day or so back (have you thought about that yet?).

Sorry, but I need to think about this more when I'm not ill.

[...]

>> Huh? one of my points all along, is that David's initial proposal
>> *DOES* make things worse --- much worse --- in the kind of networks I
>> deal with on a day-to-day basis. I believe I have given both you and
>> Manuel and David examples of just what breaks. Should I repeat them?
>
>I'm sorry. I thought David had the patch doing this change sooner in the
>processing than it is. To the extent that we may choose an address not on
>an interface after having chosen that interface, I agree the patch is bad.
>[snip]





>> Suppose you have an interface with 4 addresses. One of them is removed
>> (due to VRRP, or CARP, or manual removal and later re-addition, or
>> whatever good and sufficient reasons the super-user has at the time).
>> At that point the order of the 4 addresses in the in-kernel list
>> changes.  Thus, outbound IP addresses change, and *stay* changed,
>> until a reboot (or some other process which removes all addresses and
>> adds them in the desired, admin-intended order).
>>
>> My experience is also that you can't stop people from making changes
>> which affect address ordering; and that therefore relying on addresses

"Ordering of addresses"

>> is fundamentally not a good solution for a general-purpose OS.
>
>Let me add an example. Back in the 1.5 era, I had a laptop that would run
>dhclient. Every now and again it'd get a different address (usually after
>it'd been asleep/off site for a while). Certain traffic, such as pings,
>would continue to use the old address, leading to much confusion. Only
>rebooting would fix things. While the old address wasn't being used for
>new connections, it was still around and ping (or some other part of ICMP)
>would pick up on it.
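
The address-ordering problem quoted above can be modelled in a few
lines. This is a toy model under two stated assumptions: the default
source is the first address in the per-interface list, and a re-added
address is appended at the tail. The addresses are illustrative.

```python
def default_source(addrs):
    """Assumption: the first address in the list is used as the default source."""
    return addrs[0] if addrs else None

# Admin-intended order for one interface with 4 addresses.
addrs = ["192.0.2.1", "192.0.2.2", "192.0.2.3", "192.0.2.4"]
before = default_source(addrs)

# One address is removed (VRRP, CARP, manual removal, ...).
addrs.remove("192.0.2.1")
during = default_source(addrs)

# The same address is added back; assumption: it lands at the tail.
addrs.append("192.0.2.1")
after = default_source(addrs)
```

In this model the default source moves from 192.0.2.1 to 192.0.2.2
and stays there after the re-addition, until something rebuilds the
list in the intended order.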


>> Okay. I acknowledge above that David is making progress in the right
>> direction.  But I continue to believe that adding IPv6-derived
>> ``scoping'' to IPv4 is in violation of the clear intent of RFC-1122,
>
>Do you think it would be a violation of the intent of 1122 if it
>happened at or before route selection?

I see two quite separate cases: IPv4 zeroconf/ link-local, and RFC-1918.

If we ever claim to implement zeroconf, we should do whatever the I-D
(or RFC) says.  (My reading of the I-D is that it meets Jason's
desiderata, but I may be misreading one or both). If the IETF reserves
that range of addresses for zeroconf/link-local and specifies
scoped(-like) behaviour for them, to guarantee they stay link-local,
then that's what we need to do.


For RFC-1918: If the admin configures strong-ES, or something akin to
it, then the admin clearly doesn't want to route packets out one
interface, with an address that's local but on a different interface.
Thus, it seems to me that for "classic" multihoming, you can't have
both.  Hence, I conclude (as I said from the beginning) that
hardcoding IPv6-inspired ``scoping'' into the kernel is verboten.
Not just bad or very bad, but verboten.

If we're talking about some mechanism that lets admins encode
(configure) a policy, that's a matter for consensus.

I tend to prefer a cleaner (non-hacky) implementation of priorities
for local addresses. I don't recall the details of the old
first-class-address/ alias-address distinction, except that they were
ugly (something like: hacks to ensure all, or at least one, 1st-class
addresses were listed before any aliases, across any sequence of
removal or readdition of 1st-class addresses).
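
One possible shape for explicit priorities, sketched under loud
assumptions: the LocalAddr type and its priority field are
hypothetical, and this is not a reconstruction of the old
first-class/alias mechanism, only an illustration of selection that
does not depend on list order.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LocalAddr:
    addr: str
    priority: int   # hypothetical field: lower value = more preferred

def default_source(addrs):
    """Pick by explicit priority; list position only breaks ties."""
    if not addrs:
        return None
    return min(addrs, key=lambda a: a.priority).addr

primary = LocalAddr("192.0.2.1", priority=0)
addrs = [primary,
         LocalAddr("192.0.2.2", priority=10),
         LocalAddr("192.0.2.3", priority=10)]

before = default_source(addrs)
addrs.remove(primary)           # address goes away for a while
during = default_source(addrs)
addrs.append(primary)           # re-added at the tail of the list
after = default_source(addrs)
```

With a priority attached to each address, removal and re-addition
restores the original choice regardless of where the address lands in
the list.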

But it is prior art; it was understandable, at least as Jason and I
seem to remember it.

I am very, very leery of routing-table mechanisms. I worry they will
lead to ARP failure on multihomed hosts.  Note that for complete
strong-ES, we really *need*

	route lookup
	default route
	ARP lookup (since it's intertwined with routes)

to be done on (remote IP addr, local IP addr, [ToS]), exactly as
described in RFC-1122.  Using route table entries may force us into
doing that sooner than we'd otherwise have to.
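
A toy illustration of a lookup keyed on (remote IP addr, local IP
addr, [ToS]) in the strong-ES style. The tables, interface names, and
addresses are hypothetical, and ToS is accepted but ignored to keep
the sketch short.

```python
from ipaddress import ip_address, ip_network

# Hypothetical multihomed host: which interface owns each local address.
IF_OF_LOCAL = {"10.0.0.5": "if0", "192.0.2.7": "if1"}

# Hypothetical routes: (prefix, output interface, gateway or None if on-link).
ROUTES = [
    ("0.0.0.0/0", "if0", "10.0.0.1"),
    ("0.0.0.0/0", "if1", "192.0.2.1"),   # a default route per interface
    ("10.0.0.0/8", "if0", None),
]

def route(remote, local, tos=0):
    """Strong-ES style: consider only routes on the interface owning `local`."""
    out_if = IF_OF_LOCAL[local]
    candidates = [r for r in ROUTES
                  if r[1] == out_if and ip_address(remote) in ip_network(r[0])]
    if not candidates:
        return None
    prefix, ifname, gateway = max(
        candidates, key=lambda r: ip_network(r[0]).prefixlen)
    return ifname, gateway
```

The same remote address resolves to a different interface and
gateway depending on the local address, so the default route (and
any ARP lookup that follows from it) is effectively per source
address in this model.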

(PS: did you think about that scenario I sketched and asked if you saw
the problem?)