Subject: Re: CVS commit: src/sys
To: Jonathan Stone <jonathan@dsg.stanford.edu>
From: Robert Elz <kre@munnari.OZ.AU>
List: tech-kern
Date: 12/08/2004 11:31:18
    Date:        Tue, 07 Dec 2004 11:20:09 -0800
    From:        Jonathan Stone <jonathan@dsg.stanford.edu>
    Message-ID:  <E1Cbksb-00011i-00@smeg.dsg.stanford.edu>

  | Are you sure?

Pretty sure yes.    I mean, why would someone explicitly make this
change ...

81c83
< struct        ifnet loif;
---
> struct        ifnet loif[NLOOP];

unless the intent was to go from supporting just one loopback interface
to supporting more than one.

What's more, the log entry for the check in that did that says ...

  revision 1.4
  date: 1993/05/07 09:27:52;  author: cgd;  state: Exp;  lines: +20 -12
  patch for multiple loopback interfaces (via "pseudo-device loop 2", etc.)
  from David Burren <davidb@otto.bf.rmit.oz.au>

Which is pretty clear to me what the intention was.

  | Robert, maybe I'm wrong, but I seem to recall a time
  | when new-config supported just needs-count, not needs-flag;

Sure.   But what has that to do with anything.   That change just made
it easier to detect bugs (in the user config file) in cases where only
one X was possible, or made sense, and the user was attempting to
request more.    As best I can tell (with no exhaustive search) everything
that could be switched to needs-flag was switched, once it was
available.   The things that were supposed to work with multiple
instantiations (or needed a numeric option for other reasons) were
left with needs-count.   That includes "pseudo-device loop".

If you didn't want this, it looks as if the time to argue was 1993
(way before I started using NetBSD, which is part of why I said "always"
in an earlier message) not now.

  | and that
  | the implementation we had didn't really work right with NLOOP > 1.

No comment.    But assuming that is correct, so what?   Lots of code has
bugs, the answer to that is generally to fix the bugs, not to discard the
code.

  | Huh?  That's *NOT* what loopback does. See smb's mention of a separate
  | Cisco-style nullif.

I know that isn't what it means, and if smb is going to donate a null
interface to serve the purpose, then fine.    My message was before he
mentioned that.   Without an interface like that (which I kind of assume
is likely to be a very close sibling of the loopback interface, without
the loopback) making the loopback interface (a loopback interface) act
that way, perhaps using one of the IF_LINK flags wouldn't be difficult -
a hack perhaps, even an ugly hack, but useful.

  | I'm also not any claims about using multiple loopback devices to count
  | various kinds of packets. There are too many places where kernel code
  | just grabs lo0 (or lo0ifp, now).

Sure, the "primary" loopback interface is special, required, and can't
be deleted or used for other purposes.   (That's why I asked about having
the "pseduo-device" line removed from config files - at the minute it is
simply a mandatory line every config file must have - its one reason for
existing for the past 11 years has been so it could have a count on it.
That reason is now gone).

That has no impact whatever on use of other loN interfaces for other purposes.
Those are just normal software type interfaces (like tunN or gifN or ...)

  | <shrug>.  Maybe so. I'm still unconvinced that's necessary, or
  | correctly implemented, or useful.

Again, there are two totally separate issues here.

First, whether multiple loopback interfaces is useful or not (and to
that you can add a sub-discussion about whether they're currently
correctly implemented).

And second, given that multiple loopback interfaces are (intended to be)
supported, which they clearly are, and have been for many years (bug free
or not), should that be via a count in the config file, or via use of
a cloning interface.   No-one so far (not even you) has argued for keeping
the count in preference to using a cloning interface.   That is all this
has all been about.

  | I think I'd sooner have one correct lo0,

Sure.   Have you any evidence at all that you don't?   Do you think no-one
has yet tested a current kernel from this week?   Given that the loopback
interface (lo0 anyway) is fundamental to the way the network code works,
don't you think we'd have heard by now if it was broken?

  | than start advertising cloning loopback devices that don't work
  | right in various corner cases.

If there are bugs in loN for N > 1, then let's find and fix them.
If.

  | Diving in and making the config-time->dynamic
  | change (with an explicit rejection of any discussion) makes it very hard
  | to achieve that.

No, in this case it was the right way.    First, because the actual change
is clearly the right thing to do (it has nothing whatever to do with
making more loopback interfaces), and second because it makes it much simpler
for people to try additional loopback interfaces - no kernel recompile
& reboot needed to test it, which makes it far more likely that it is going
to get tested, and any bugs found.

When that happens, if someone comes upon an issue where having multiple
loopback interfaces causes a problem that isn't just a code bug, but
actually means that other problems are introduced that are hard to deal
with - all because of having multiple loN interfaces configured (sensibly),
then perhaps we can go back and rip out the cloning code, and return
if_loop.c back to revision 1.3 (1992 vintage), and see if that is better.

Until then, what's the point of complaining, as best I can tell, if you
only make lo0, and never create lo1 (or any other loN for N > 0) nothing
has really changed, so it isn't going to affect anyone adversely.

kre