Subject: Re: default route and private networks
To: Jonathan Stone <jonathan@dsg.stanford.edu>
From: Bill Studenmund <wrstuden@netbsd.org>
List: tech-net
Date: 04/25/2005 15:58:51

On Mon, Apr 25, 2005 at 12:45:31PM -0700, Jonathan Stone wrote:
>
> In message <20050425180959.GA20220@netbsd.org>,
> Bill Studenmund writes:
>
> [...]
>
> >> Re: not seeing forest for trees: yes, I've been working with policy
> >> issues, and an SO_BINDTODEV-like implementation for some years
> >> now. Personally, I find that to be a *much* better solution for what
> >> you say (in a subsequent message) you want.
> >
> >I disagree. What I want is something that works well for a client. A new
> >call, of any sort, means each client has to be changed to support it. I
> >want something where I configure the stack to do what I want, and then all
> >clients do the right thing.
>
> I am skeptical this is technically feasible, at least if we agree to
> consider both `clients' and `servers'. I see no good way to specify
> all desired policies in the way you propose, short of a firewall-style
> filter language: write your policies in the language, then insert
> them into the kernel.

Ahh... now we're wandering into implementation questions.

Note: I'm about to expand on ideas from the note I just sent, so please
forgive me if I repeat things you haven't had a chance to respond to. :-)

> For my purposes, which are server-oriented, that is overkill.  The
> performance overhead for interpreting such rules, for every packet
> sent and received, is not acceptable for the kinds of environments I
> care about -- multiple 1 gig NICs at saturation, or 10GbE NICs.  I'd
> like to talk about that separately; I will reply separately (a third
> time, sorry!) about that.

Ok, here are some implementation thoughts. First, this question should
ONLY be coming up when we don't already have a source address. For TCP,
once we have connected, we should always be using our IP for the socket.
Likewise, if a UDP send has given an address or any other IP packet
generation method (protocol, direct creation, etc.) has specified the
source address, this code should not change it.

Other code may forbid it, apply other policy, or whatever (like you can't
send from low-numbered ports if you aren't root), but we already have that
and I'm not suggesting changing it.

So I doubt that this code, if done even moderately right, would impact the
environments you describe above; we aren't sending TCP connect packets at
line rate (well, not in the normal case :-) .
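
As a rough illustration of why the per-packet cost should stay flat, here
is a minimal sketch in Python (chosen purely for illustration; nothing here
is kernel code). The Socket class, the select_source hook, and the
policy_calls counter are all hypothetical names. The point is only that the
selection policy runs at connect time when no source is set, and never on
the per-packet send path.

```python
from typing import Callable, Optional, Tuple


class Socket:
    """Toy socket that picks a source address at most once (illustrative only)."""

    def __init__(self, select_source: Callable[[str], str],
                 bound_src: Optional[str] = None) -> None:
        self._select_source = select_source  # hypothetical policy hook
        self.src = bound_src                 # set if the caller bound an address
        self.dst: Optional[str] = None
        self.policy_calls = 0                # how often the policy actually ran

    def connect(self, dst: str) -> None:
        self.dst = dst
        if self.src is None:
            # Only here, with no source chosen yet, is the policy consulted.
            self.policy_calls += 1
            self.src = self._select_source(dst)

    def send(self, payload: bytes) -> Tuple[Optional[str], Optional[str], bytes]:
        # Per-packet path: reuse the source fixed at connect time.
        return (self.src, self.dst, payload)


# One connect, many sends: the policy hook runs once.
s = Socket(lambda dst: "10.0.0.5")
s.connect("10.0.0.1")
for _ in range(1000):
    s.send(b"x")
```

A socket created with bound_src set never calls the hook at all, which
matches the rule that an already-specified source must not be changed.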


So how about an idea someone else suggested earlier in the thread, that we
add a field to the routes that indicates a source address to use if no
other source is specified?

Thus we wouldn't add any more complexity to the common case, and an
administrator could set whatever he or she wanted by merely updating the
routing table.
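
To make the per-route idea concrete, here is a minimal sketch in Python,
under stated assumptions: the Route class, its src_hint field, and the
fallback to the first address on the outgoing interface are all
hypothetical and are not taken from the NetBSD routing code. An explicit
source always wins; otherwise the best-matching route's hint is used if it
has one.

```python
import ipaddress
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Route:
    dest: ipaddress.IPv4Network
    ifname: str
    # Hypothetical field: source to use when the caller specified none.
    src_hint: Optional[ipaddress.IPv4Address] = None


def lookup(table: List[Route], dst: ipaddress.IPv4Address) -> Optional[Route]:
    """Longest-prefix match over a flat list of routes."""
    matches = [r for r in table if dst in r.dest]
    if not matches:
        return None
    return max(matches, key=lambda r: r.dest.prefixlen)


def choose_source(table: List[Route],
                  if_addrs: Dict[str, List[ipaddress.IPv4Address]],
                  dst: ipaddress.IPv4Address,
                  explicit_src: Optional[ipaddress.IPv4Address] = None
                  ) -> ipaddress.IPv4Address:
    # A source given by bind(), connect(), or the sender is never overridden.
    if explicit_src is not None:
        return explicit_src
    route = lookup(table, dst)
    if route is None:
        raise LookupError("no route to host")
    if route.src_hint is not None:
        return route.src_hint
    # Assumed fallback: first address configured on the outgoing interface.
    return if_addrs[route.ifname][0]


ip = ipaddress.ip_address
if_addrs = {"ex0": [ip("10.1.2.3"), ip("192.0.2.10")]}
table = [
    Route(ipaddress.ip_network("10.0.0.0/8"), "ex0"),
    # Default route carries a public source for off-net traffic.
    Route(ipaddress.ip_network("0.0.0.0/0"), "ex0", src_hint=ip("192.0.2.10")),
]
```

With this table, traffic to the private network keeps the private address,
while traffic following the default route picks up the hinted public one.
The common case (source already set) returns before any route lookup.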

Perhaps all that would be needed would be to make the -ifa parameter work.
I'm not sure why it doesn't work now; I suspect something in the IP layer
isn't looking.

> >We have one netinet and tens to hundreds (to thousands) of clients. Making
> >each client have to change for this doesn't scale well. I'd rather we
> >change one thing to get it right rather than thousands of things.
>
> >I understand your description of problems with Linux NFS. However I don't
> >see how any change to how addresses are picked will break WORSE than we
> >have now. :-)
>
> Huh? one of my points all along, is that David's initial proposal
> *DOES* make things worse --- much worse --- in the kind of networks I
> deal with on a day-to-day basis. I believe I have given both you and
> Manuel and David examples of just what breaks. Should I repeat them?

I'm sorry. I thought David had the patch doing this change sooner in the
processing than it is. To the extent that we may choose an address not on
an interface after having chosen that interface, I agree the patch is bad.

[snip]

> Suppose you have an interface with 4 addresses. One of them is removed
> (due to VRRP, or CARP, or manual removal and later re-addition, or
> whatever good and sufficient reasons the super-user has at the time).
> At that point the order of the 4 addresses in the in-kernel list
> changes.  Thus, outbound IP addresses change, and *stay* changed,
> until a reboot (or some other process which removes all addresses and
> adds them in the desired, admin-intended order).
>
> My experience is also that you can't stop people from making changes
> which affect address ordering; and that therefore relying on addresses
> is fundamentally not a good solution for a general-purpose OS.
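
The ordering problem quoted above can be shown with a toy model. This is a
minimal sketch in Python with two loudly stated assumptions that are not
from the kernel source: the default source is simply the first entry in
the per-interface list, and a re-added address is appended at the tail.
The function names are hypothetical.

```python
from typing import List


def default_source(addrs: List[str]) -> str:
    # Assumption: the first address in the list is the default source.
    return addrs[0]


def remove_and_readd(addrs: List[str], addr: str) -> List[str]:
    # Assumption: a re-added address lands at the end of the list.
    remaining = [a for a in addrs if a != addr]
    return remaining + [addr]


addrs = ["192.0.2.1", "192.0.2.2", "192.0.2.3", "192.0.2.4"]
before = default_source(addrs)

# The primary address is briefly removed (failover, manual change) and re-added.
addrs = remove_and_readd(addrs, "192.0.2.1")
after = default_source(addrs)
```

The set of configured addresses is the same before and after, yet the
default source has silently moved to a different address and stays there
until someone rebuilds the list in the intended order.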

Let me add an example. Back in the 1.5 era, I had a laptop that would run
dhclient. Every now and again it'd get a different address (usually after
it'd been asleep/off site for a while). Certain traffic, such as pings,
would continue to use the old address, leading to much confusion. Only
rebooting would fix things. While the old address wasn't being used for
new connections, it was still around and ping (or some other part of ICMP)
would pick up on it.

> Okay. I acknowledge above that David is making progress in the right
> direction.  But I continue to believe that adding IPv6-derived
> ``scoping'' to IPv4 is in violation of the clear intent of RFC-1122,

Do you think it would be a violation of the intent of 1122 if it
happened at or before route selection?

> and also in violation of well-established prior art. That's one very,
> very big hammer against the proposal. But combine that idea with a
> proposal to hardcode specific, non-sysadmin-configurable heuristics as
> policy, inside the kernel, and you have enough for a reasonable,
> knowledgeable person, to decide that David's proposal, as stated, is
> out-of-court: no further discussion required.  But I could live with
> an optional, explicitly-specified, ordering of which addresses to choose.

I agree that hard-coded policy is bad, especially for this.

Take care,

Bill
