Subject: Re: down interfaces, link detection, and connected routes
To: Miles Nordin <carton@Ivy.NET>
From: Greg Troxel <gdt@ir.bbn.com>
List: tech-net
Date: 01/19/2005 09:50:56

Miles Nordin <carton@Ivy.NET> writes:

> The problem is, if the kernel takes the cloning route away, packets
> will follow the default route, won't they?  What does Cisco do when it
> loses link-detect to a network, but has a ``gateway of last resort''
> on another interface?  Does it call the network unreachable, or use
> one of its matching shorter-prefix routes?

Sure, or whatever other prefix matches once the cloning route is gone.
This seems sane; it is probably the best bet in general, assuming one
believes 'link definitely down' means no good could possibly come of
sending a packet.

My understanding is that Cisco routers will remove the route from the
FIB, and thus use whatever other routes are in the RIB, which would be
for the same prefix (e.g. to a neighbor learned via OSPF), or a
shorter prefix.  Basically, they seem to treat an interface without
link detect as if it did not exist.
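
To illustrate that fallback, here is a minimal Python sketch of a
longest-prefix-match table.  It is a toy model, not Cisco's or
NetBSD's implementation; the Fib class, the interface name wired0,
and all addresses are made up for the example.

```python
import ipaddress


class Fib:
    """Toy forwarding table mapping prefix -> next-hop description."""

    def __init__(self):
        self.routes = {}

    def add(self, prefix, nexthop):
        self.routes[ipaddress.ip_network(prefix)] = nexthop

    def remove(self, prefix):
        self.routes.pop(ipaddress.ip_network(prefix), None)

    def lookup(self, dest):
        """Return the next hop of the longest matching prefix, or None."""
        addr = ipaddress.ip_address(dest)
        matches = [net for net in self.routes if addr in net]
        if not matches:
            return None
        best = max(matches, key=lambda net: net.prefixlen)
        return self.routes[best]


fib = Fib()
fib.add("0.0.0.0/0", "default via 192.0.2.1")
fib.add("10.1.0.0/16", "via OSPF neighbor 192.0.2.2")
fib.add("10.1.2.0/24", "connected on wired0")

before = fib.lookup("10.1.2.7")   # the connected route wins
fib.remove("10.1.2.0/24")         # link detect lost on wired0
after = fib.lookup("10.1.2.7")    # falls back to the shorter /16
```

Once the connected /24 is gone, the lookup lands on the /16 learned
from the neighbor; if that were absent too, it would land on the
default route.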

> I think it is good to keep it so RIB and FIB are separate, and the FIB
> should not contain any inactive routes.  For every prefix the FIB
> should contain one REJECT route, one BLACKHOLE route, or n equal-cost
> forwarding routes.

Or no route at all.
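
As a rough sketch of the rule quoted above, with the "no route at
all" case added: the Candidate tuple, its fields, and the tie-break
that prefers a REJECT/BLACKHOLE entry over forwarding routes of the
same metric are all assumptions made for this example, not how any
particular routing daemon works.

```python
from collections import namedtuple

# Hypothetical RIB entry: kind is "FORWARD", "REJECT" or "BLACKHOLE".
Candidate = namedtuple("Candidate", "kind nexthop metric usable")


def select_fib(rib):
    """Pick, per prefix, the entries that belong in the FIB."""
    fib = {}
    for prefix, candidates in rib.items():
        live = [c for c in candidates if c.usable]
        if not live:
            continue  # no route at all for this prefix
        best_metric = min(c.metric for c in live)
        best = [c for c in live if c.metric == best_metric]
        special = [c for c in best if c.kind in ("REJECT", "BLACKHOLE")]
        # One REJECT/BLACKHOLE entry, or all equal-cost forwarding routes.
        fib[prefix] = special[:1] if special else best
    return fib


rib = {
    "10.1.2.0/24": [
        Candidate("FORWARD", "wired0", 0, False),   # link is down
        Candidate("FORWARD", "192.0.2.2", 20, True),
        Candidate("FORWARD", "192.0.2.3", 20, True),
        Candidate("REJECT", None, 200, True),       # inactive, RIB only
    ],
    "10.5.0.0/16": [Candidate("FORWARD", "wired1", 0, False)],
}
fib = select_fib(rib)
```

Here 10.1.2.0/24 ends up with the two equal-cost forwarding routes,
the higher-metric REJECT stays in the RIB only, and 10.5.0.0/16 has
no FIB entry at all.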

> Inactive routes, like REJECT routes where we have
> a higher-priority/lower-metric feasible route, belong in RIB only, and
> not in the NetBSD kernel FIB.  userland daemon will store them in RIB
> and swap them in and out of FIB as appropriate.
> 
> If you agree with this, then:
> 
>  If losing link-detect is going to imply a REJECT route, so packets
>  destined to the network without link-detect will NOT follow a route
>  with a shorter prefix (default route), then the userland daemon
>  should manage the whole thing.

I wasn't thinking this at all.

>  If it is ok for packets to follow default or shorter-prefix route,
>  then the kernel should simply remove the cloning route and all routes
>  cloned from it whenever link-detect goes away.

Sure, that sounds fine.  Also on 'ifconfig down'.
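
A minimal sketch of what that removal could look like, written in
Python rather than kernel C for brevity.  The Route and RouteTable
types and the interface_unusable() hook are hypothetical names for
this example; the real kernel data structures differ.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(eq=False)  # compare routes by identity, not by field values
class Route:
    dest: str
    ifname: str
    cloning: bool = False
    parent: Optional["Route"] = None  # set on routes cloned from another


class RouteTable:
    def __init__(self) -> None:
        self.routes: List[Route] = []

    def add_cloning(self, prefix: str, ifname: str) -> Route:
        route = Route(prefix, ifname, cloning=True)
        self.routes.append(route)
        return route

    def add_cloned(self, parent: Route, host: str) -> Route:
        route = Route(host, parent.ifname, parent=parent)
        self.routes.append(route)
        return route

    def interface_unusable(self, ifname: str) -> None:
        """Hypothetical hook: link detect lost, or 'ifconfig down'."""
        doomed = [r for r in self.routes if r.cloning and r.ifname == ifname]
        # Drop the cloning routes and everything cloned from them.
        self.routes = [r for r in self.routes
                       if r not in doomed and r.parent not in doomed]


table = RouteTable()
wired = table.add_cloning("10.1.2.0/24", "wired0")
table.add_cloned(wired, "10.1.2.7")
table.add_cloned(wired, "10.1.2.9")
table.add_cloning("10.9.0.0/24", "wifi0")
table.routes.append(Route("0.0.0.0/0", "wifi0"))  # ordinary default route

table.interface_unusable("wired0")
remaining = [r.dest for r in table.routes]
```

After the call only the wifi0 cloning route and the default route
remain, so traffic for 10.1.2.0/24 follows whatever shorter prefix
still matches.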

> I don't know what makes more sense, or if we are trying to copy Cisco
> or what.

I am trying to solve the problem of a machine with several interfaces
not being able to forward to the prefix that belongs on one of them
during a link-down condition, when really there is another route in
the RIB.  This really happened to me, and it broke connectivity:
someone moved one (of a set of redundant) wireless/wired routers from
the stable location to a conference room, and that box blackholed the
traffic.  Once I did ifconfig down/ifconfig delete on the disconnected
wired interface, connectivity was restored.

> for me, just so long as whatever gets done, the whole system with
> quagga and dhclient and crufty ifwatchd scripts and whatever else
> watches link-detect doesn't cause open TCP connections to get
> instantly shut down when link detection is lost like on Windows ExPee,
> I'm happy.

Sure, but that seems to be a wholly separate issue.  TCP shouldn't
close a connection due to not having a route.  I run lots of systems
(23, of which 20 are wireless only) that have no static default route,
and only have one when they have connectivity (and OSPF adjacencies).
TCP connections should survive routing changes, regardless of link
detect.  So if this is broken it needs a more generic fix.

If you just mean "don't copy this particular MS brokenness", fine - I
have no intention of that.

-- 
        Greg Troxel <gdt@ir.bbn.com>