NetBSD-Bugs archive


Re: kern/52074: -current npf map directive broken



On 08/05/2017 00:12, Robert Elz wrote:
    Date:        Sun, 7 May 2017 17:17:28 -0400
    From:        christos%zoulas.com@localhost (Christos Zoulas)
    Message-ID:  <20170507211728.8AF0A17FDA8%rebar.astron.com@localhost>

  | On May 7,  8:50pm, roy%marples.name@localhost (Roy Marples) wrote:

  | | The idea is that if we claim to send from an address it has to be valid,
  | | but allow the NULL address if forwarded from the filter.
  | |
  | | Does this make sense?
  | | The same path probably needs adjustment in inet6.
  |
  | Sure, go for it. Why not put all the logic in ip_ifaddrvalid then?

While you are fixing this, please also fix ...

                         * Address exists, but is tentative or detached.
                         * We can't send from it because it's invalid,

Tentative addresses are not invalid, and we *have* to be able to send
from them; without that, DAD doesn't work correctly.

DAD happens at the ARP level, not the IP level.
Different checks are in place there.

Consider a (perhaps unlikely, but possible) case where 2 systems attempt
to allocate the same address at almost exactly the same time
(broken dhcp server, or common reboot after power fail and they have both
been configured to use the same addr by a broken sys-admin).

In that case they both make the addr tentative, each then starts DAD by
sending a "does anyone own..." query.   Each receives the other's DAD
packet.   The protocol then requires that, as we have that address, even
as a tentative one, we must reply (from the tentative address) and
claim the addr - both systems should do that, both receive the other's
defence of the addr, DAD fails (on both) and they both abandon the address.

Address defence only happens once DAD has passed.
In the scenario you describe this has not occurred for either host.
If both receive each other's initial probe, then both will abort DAD and mark the address as duplicated.
This is described in RFC5227, 2.1.1:

   In addition, if during this period the host receives any ARP Probe
   where the packet's 'target IP address' is the address being probed
   for, and the packet's 'sender hardware address' is not the hardware
   address of any of the host's interfaces, then the host SHOULD
   similarly treat this as an address conflict and signal an error to
   the configuring agent as above.  This can occur if two (or more)
   hosts have, for whatever reason, been inadvertently configured with
   the same address, and both are simultaneously in the process of
   probing that address to see if it can safely be used.
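
The rule quoted above can be sketched roughly as follows. This is only an illustrative model, not the NetBSD ARP code: the names ArpProbe and probe_conflicts, and the example hardware and IP addresses, are assumptions made up for the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArpProbe:
    """Hypothetical, minimal view of a received ARP Probe."""
    sender_hw: str  # 'sender hardware address'
    sender_ip: str  # all zeroes in a probe
    target_ip: str  # 'target IP address', the address being probed for


def probe_conflicts(probe, probing_ip, own_hw_addrs):
    """True if a received probe conflicts with the address we are probing.

    Models RFC 5227 2.1.1: the probe targets the address we are probing
    for, and its sender hardware address is not one of our own interfaces.
    """
    return probe.target_ip == probing_ip and probe.sender_hw not in own_hw_addrs


# Two hosts probing the same address at the same time (example values).
a_hw, b_hw = "00:00:5e:00:53:01", "00:00:5e:00:53:02"
addr = "192.0.2.10"
probe_from_a = ArpProbe(a_hw, "0.0.0.0", addr)
probe_from_b = ArpProbe(b_hw, "0.0.0.0", addr)

a_conflict = probe_conflicts(probe_from_b, addr, {a_hw})  # A hears B's probe
b_conflict = probe_conflicts(probe_from_a, addr, {b_hw})  # B hears A's probe
own_echo = probe_conflicts(probe_from_a, addr, {a_hw})    # A hears its own probe
```

In this model both hosts detect the conflict from the probes alone, and neither has to send anything from the tentative address to do so.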

And RFC5227, 2.4:
   (b) If a host currently has active TCP connections or other reasons
       to prefer to keep the same IPv4 address, and it has not seen any
       other conflicting ARP packets within the last DEFEND_INTERVAL
       seconds, then it MAY elect to attempt to defend its address by
       recording the time that the conflicting ARP packet was received,
       and then broadcasting one single ARP Announcement, giving its own
       IP and hardware addresses as the sender addresses of the ARP,
       with the 'target IP address' set to its own IP address, and the
       'target hardware address' set to all zeroes.  Having done this,
       the host can then continue to use the address normally without
       any further special action.  However, if this is not the first
       conflicting ARP packet the host has seen, and the time recorded
       for the previous conflicting ARP packet is recent, within
       DEFEND_INTERVAL seconds, then the host MUST immediately cease
       using this address and signal an error to the configuring agent
       as described above.  This is necessary to ensure that two hosts
       do not get stuck in an endless loop with both hosts trying to
       defend the same address.
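
A rough sketch of option (b) as quoted above, for an address that has already passed DAD. The class and method names are hypothetical, time is passed in explicitly to keep the example deterministic, and DEFEND_INTERVAL uses the 10 second value defined in RFC 5227. It is a model of the quoted text only, not of any particular implementation.

```python
DEFEND_INTERVAL = 10.0  # seconds, as defined in RFC 5227


class Defender:
    """Hypothetical per-address defence state, valid only after DAD passed."""

    def __init__(self):
        self.last_conflict = None  # time of the previous conflicting ARP packet
        self.in_use = True

    def on_conflict(self, now):
        """Handle a conflicting ARP packet received at time `now` (seconds)."""
        recent = (
            self.last_conflict is not None
            and now - self.last_conflict < DEFEND_INTERVAL
        )
        if recent:
            # Second conflict within DEFEND_INTERVAL: MUST cease using the address.
            self.in_use = False
            return "abandon"
        # First conflict, or the previous one is old: record the time and
        # defend by broadcasting one single ARP Announcement.
        self.last_conflict = now
        return "announce"


d = Defender()
first = d.on_conflict(0.0)   # defend once
second = d.on_conflict(5.0)  # conflict again within DEFEND_INTERVAL
```

The second conflict arriving inside DEFEND_INTERVAL is what stops two hosts from defending the same address against each other forever.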

But if we are not able to send from the addr, because it is considered
"invalid" then neither defends its addr, neither receives the other's
defence, and both then conclude that the addr is safe, and both (eventually)
reset the tentative flag, and chaos ensues.

This is not what happens.
DAD has not passed, so there is no address to defend; it's just marked DUPLICATED and stays that way until manual intervention.

What we should be doing with tentative addresses is allowing them to be
used (for any purpose), but just not announcing them to the rest of the
system (except to tools like ifconfig and netstat) until DAD has
completed.   That means no other process will discover that the addr
exists, and so will not start to use it (nor will TCP accidentally see
the addr as existing, and start to use it) and so the incentive that
led to prohibiting sending from tentative addrs, or considering them
to be invalid, is gone.

On a bad day, DAD on IPv4 can take about 10 seconds to complete.
That's a big window in which to run ifconfig or getifaddrs, find these unannounced addresses and start using them.
Consider a wireless interface:

dhcpcd starts, sees all interfaces are currently down and forks right away; bootup continues
more daemons start up
wifi comes up, dhcpcd is offered an address and adds it
ntpd starts at this point and tries to start using the tentative address.

It also permits applications that really need to start running quickly, and
know how to deal with addresses that appear and disappear, to start using
the tentative addr before it is made permanent, if they like (and if they
can find some mechanism - perhaps using whatever hook ifconfig/netstat use -
to discover that the addr exists.)

If the application really needs to start running quickly, it can disable DAD like so:
net.inet.ip.dad_count=0

Otherwise it can wait for the address to finish DAD and then it can start using it.

Roy

