Subject: Re: Kernel Suspects Partitioning
To: None <tech-net@netbsd.org>
From: Brad du Plessis <bradd@cat.co.za>
List: tech-net
Date: 01/20/2004 10:53:30
> RTM_LOSING: Kernel Suspects Partitioning: len 112, pid: 0, seq 0, errno 0,
> flags:<UP,GATEWAY,DONE,STATIC>

I'm seeing this error message when I do the following:

Consider this setup:

A ----- B ===== C

Boxes A and B are on the same subnet, and box B has made a dialup to box C.
Box A has a route set up to box C via B, and box C has a route set up to
box A via B.

From time to time (seemingly at random) following a dialup, I am unable
to connect directly from A to C. The only way I have found to fix this is
to ping in the reverse direction (from box C to A), after which the
connection from A to C succeeds.

The only way I've been able to reproduce this error consistently is by
making the dialup and doing a ping from A to C, then on box B deleting the
route that pppd sets up to get to C and adding it again. For example, if the
IP addresses on the ppp interface after the dialup are:

B: 1.1.1.1
C: 1.1.1.2

then I do the following on box B:

route delete 1.1.1.2
route add 1.1.1.2 1.1.1.1

From this point on, a ping from A to C will fail, even if I close the ppp
connection and reopen it. At this point I see the RTM_LOSING message in
"route monitor".

A traceroute on A reveals that the packets arriving at box B from A are
being redirected to B's default gateway. If I manually delete B's default
gateway, the ping from A to C then succeeds.

I have since done a bit of digging in the kernel and found the following:

In netinet/in_pcb.c, the function in_losing(...) is called when a packet
from A is redirected to B's default gateway. It appears that the route set
up by pppd is not dynamic (the flag RTF_DYNAMIC is not set), and doing an
rtrequest to delete the default route and add it again seems to fix this
problem for a subsequent connection.

In the in_losing function, the changes are as follows:

if (rt->rt_flags & RTF_DYNAMIC) {
    /* Dynamic route: delete it. */
    (void) rtrequest(RTM_DELETE, rt_key(rt),
                     rt->rt_gateway, rt_mask(rt), rt->rt_flags,
                     (struct rtentry **)0);
} else {
    /* Non-dynamic route: delete it, then add it again. */
    (void) rtrequest(RTM_DELETE, rt_key(rt),
                     rt->rt_gateway, rt_mask(rt), rt->rt_flags,
                     (struct rtentry **)0);

    (void) rtrequest(RTM_ADD, rt_key(rt),
                     rt->rt_gateway, rt_mask(rt), rt->rt_flags,
                     (struct rtentry **)0);

    rtfree(rt);
}
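
As a small user-space illustration (not kernel code) of which branch the
route from the RTM_LOSING message takes, here is a minimal sketch of the
RTF_DYNAMIC test. It assumes the usual BSD <net/route.h> flag values; the
X_-prefixed names and both helper functions are hypothetical and exist only
so the sketch compiles outside the kernel.

```c
/* Flag values assumed to match BSD <net/route.h>; redefined locally under
 * hypothetical X_ names so this compiles without kernel headers. */
#define X_RTF_UP       0x1    /* route usable */
#define X_RTF_GATEWAY  0x2    /* destination is a gateway */
#define X_RTF_DYNAMIC  0x10   /* created dynamically (by redirect) */
#define X_RTF_DONE     0x40   /* message confirmed */
#define X_RTF_STATIC   0x800  /* manually added */

/* Flags reported in the RTM_LOSING message: <UP,GATEWAY,DONE,STATIC>. */
int reported_flags(void)
{
    return X_RTF_UP | X_RTF_GATEWAY | X_RTF_DONE | X_RTF_STATIC;
}

/* Same test as the if-condition above: returns 1 when the dynamic branch
 * would run, 0 when the else branch would run. */
int takes_dynamic_branch(int rt_flags)
{
    return (rt_flags & X_RTF_DYNAMIC) != 0;
}
```

With the flags from the message, takes_dynamic_branch(reported_flags())
returns 0, so it is the else branch (delete, add again, rtfree) that runs
for this route.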

My question is: can anyone tell me whether doing this will cause problems
elsewhere? And if this is not a good solution, can anyone suggest a better
one?

Thanks,
Brad
