Subject: Re: Playing with routes hangs kernel (A Fix)
To: None <current-users@sun-lamp.cs.berkeley.edu>
From: Mostyn R. Lewis <mrl@teleport.com>
List: current-users
Date: 03/21/1994 13:48:48
> "Michael L. VanLoon -- Iowa State University" <michaelv@iastate.edu> says
> The lockup always seems to come *immediately* after issuing a "route
> delete something something". The machine then requires a power cycle
> (I have no hardware reset switch) to reboot. Anyone willing to track
> this one down?
> "Tom (T.M.) Malaher" <tmalaher@nt.com> says
> Me too.
> ... Note that the problem is inconsistent... sometimes I
> *can* delete the route successfully.
> Funny, when pppd adds and deletes the routes, things work OK, but when
> I start up slip, and have to delete the routes manually things can get
> hosed.
Yes, this can cause a kernel page fault or a random hang.
When ppp finishes, it issues a SIOCDIFADDR ioctl to delete the interface
address, equivalent to -
ifconfig ppp0 delete (for ppp unit 0)
This deletes a kernel table entry for address information (ifaddr
structure) linked to the interface table (ifnet structure).
If you had added a route for the ppp destination such as default, e.g.
route add default pppdest
and now, after ppp has terminated, wished to delete it, e.g.
route delete default
you stand a good chance of a kernel page fault or a random hang because
the routing table entry for default still holds a pointer to the deleted
kernel ifaddr structure. Depending on the re-use state of that memory you
live or die. Death is caused by calling through a function pointer
(ifa_rtrequest) read from the stale structure.
A fix for this, in route.c (/usr/src/sys/net), is to scan the linked list
of ifaddr entries for the interface to see if the pointer is still valid
before calling through it.
The diffs for the fix are appended. There are two versions: mute and
verbose. The verbose fix prints information to the system log to verify
states.
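The idea behind the scan can be tried outside the kernel. What follows is
only a user-space sketch: the structures are cut-down stand-ins for the
kernel ones, and ifa_is_linked(), ifa_unlink(), route_delete_hook(),
count_hook() and hook_calls are hypothetical names made up for this
illustration, not anything in route.c. Unlike the appended diffs, the sketch
checks list membership before reading anything through the saved pointer;
that ordering is a choice made here, not what the patch does.

```c
/* User-space model only: simplified stand-ins for the kernel structures. */
#include <assert.h>
#include <stddef.h>

#define RTM_DELETE 0x2                    /* value used by this model */

struct rtentry;                           /* forward declaration */

struct ifaddr {
    struct ifaddr *ifa_next;              /* next address on the same interface */
    void (*ifa_rtrequest)(int, struct rtentry *);   /* per-address route hook */
};

struct ifnet {
    struct ifaddr *if_addrlist;           /* head of the interface's address list */
};

struct rtentry {
    struct ifnet  *rt_ifp;                /* interface the route uses */
    struct ifaddr *rt_ifa;                /* saved pointer; may have gone stale */
};

int hook_calls;                           /* how many times the hook ran */

/* Hypothetical hook that only counts its invocations. */
void count_hook(int cmd, struct rtentry *rt)
{
    (void)cmd;
    (void)rt;
    hook_calls++;
}

/* Return 1 if ifa is still on ifp's address list, 0 otherwise.
 * Only pointer values are compared; ifa itself is never dereferenced. */
int ifa_is_linked(const struct ifnet *ifp, const struct ifaddr *ifa)
{
    const struct ifaddr *t;

    if (ifp == NULL || ifa == NULL)
        return 0;
    for (t = ifp->if_addrlist; t != NULL; t = t->ifa_next)
        if (t == ifa)
            return 1;
    return 0;
}

/* Unlink ifa from ifp's list, modelling what deleting the address does.
 * The route entry is deliberately left untouched, as in the bug. */
void ifa_unlink(struct ifnet *ifp, struct ifaddr *ifa)
{
    struct ifaddr **pp;

    assert(ifp != NULL);
    for (pp = &ifp->if_addrlist; *pp != NULL; pp = &(*pp)->ifa_next) {
        if (*pp == ifa) {
            *pp = ifa->ifa_next;
            ifa->ifa_next = NULL;
            return;
        }
    }
}

/* Guarded delete: run the hook only if rt_ifa is still linked to rt_ifp.
 * Returns 1 if the hook was called, 0 if it was skipped. */
int route_delete_hook(struct rtentry *rt)
{
    struct ifaddr *ifa = rt->rt_ifa;

    if (!ifa_is_linked(rt->rt_ifp, ifa))
        return 0;                         /* stale or missing: do not touch it */
    if (ifa->ifa_rtrequest == NULL)
        return 0;
    ifa->ifa_rtrequest(RTM_DELETE, rt);
    return 1;
}
```

In this model, route_delete_hook() calls the hook while the address is still
on the interface's list and skips it once ifa_unlink() has removed the
address, which is the behaviour the fix aims for.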
To test this, the following sequence will suffice -
#!/bin/sh
set -x
FICTITIOUS_HOST=1.2.3.4
FICTITIOUS_GATEWAY=5.4.3.2
ifconfig ppp0 $FICTITIOUS_HOST $FICTITIOUS_GATEWAY netmask 0xffffff00 up
ifconfig ppp0
route -v add default $FICTITIOUS_GATEWAY
netstat -r
ifconfig ppp0 delete
ifconfig ppp0
netstat -r
route -v delete default
It would be preferable to have ddb in your kernel to catch any catastrophes
if you want to test this.
Finally, I believe this applies to any device used in the above manner.
Mute fix - cut here ----------------------------
*** route.c Mon Mar 21 13:05:33 1994
--- route.c.MRL.MIN Mon Mar 21 13:05:33 1994
***************
*** 356,361 ****
--- 356,362 ----
register struct rtentry *rt;
register struct radix_node *rn;
register struct radix_node_head *rnh;
+ struct ifnet *ifp;
struct ifaddr *ifa, *ifa_ifwithdstaddr();
struct sockaddr *ndst;
u_char af = dst->sa_family;
***************
*** 382,389 ****
panic ("rtrequest delete");
rt = (struct rtentry *)rn;
rt->rt_flags &= ~RTF_UP;
! if ((ifa = rt->rt_ifa) && ifa->ifa_rtrequest)
! ifa->ifa_rtrequest(RTM_DELETE, rt, SA(0));
rttrash++;
if (rt->rt_refcnt <= 0)
rtfree(rt);
--- 383,399 ----
panic ("rtrequest delete");
rt = (struct rtentry *)rn;
rt->rt_flags &= ~RTF_UP;
! if ((ifa = rt->rt_ifa) && ifa->ifa_rtrequest){
! if(ifp = rt->rt_ifp){
! struct ifaddr *tifa;
! for (tifa = ifp->if_addrlist; tifa; tifa = tifa->ifa_next) {
! if(tifa == ifa){
! ifa->ifa_rtrequest(RTM_DELETE, rt, SA(0));
! break;
! }
! }
! }
! }
rttrash++;
if (rt->rt_refcnt <= 0)
rtfree(rt);
Mute fix - cut here ----------------------------
Verbose fix - cut here ----------------------------
*** route.c Mon Mar 21 13:05:33 1994
--- route.c.MRL Mon Mar 21 13:05:33 1994
***************
*** 356,361 ****
--- 356,362 ----
register struct rtentry *rt;
register struct radix_node *rn;
register struct radix_node_head *rnh;
+ struct ifnet *ifp;
struct ifaddr *ifa, *ifa_ifwithdstaddr();
struct sockaddr *ndst;
u_char af = dst->sa_family;
***************
*** 382,389 ****
panic ("rtrequest delete");
rt = (struct rtentry *)rn;
rt->rt_flags &= ~RTF_UP;
! if ((ifa = rt->rt_ifa) && ifa->ifa_rtrequest)
! ifa->ifa_rtrequest(RTM_DELETE, rt, SA(0));
rttrash++;
if (rt->rt_refcnt <= 0)
rtfree(rt);
--- 383,415 ----
panic ("rtrequest delete");
rt = (struct rtentry *)rn;
rt->rt_flags &= ~RTF_UP;
! if ((ifa = rt->rt_ifa) && ifa->ifa_rtrequest){
! if(ifp = rt->rt_ifp){
! struct ifaddr *tifa;
! for (tifa = ifp->if_addrlist; tifa; tifa = tifa->ifa_next) {
! if(tifa == ifa){
! ifa->ifa_rtrequest(RTM_DELETE, rt, SA(0));
! printf("rtentry - route ifa_rtrequest done for %s%d\n",ifp->if_name,ifp->if_unit);
! break;
! }
! }
! if(!tifa)
! printf("rtentry - address for %s%d no longer exists\n",ifp->if_name,ifp->if_unit);
! }
! else {
! printf("rtentry - route has empty ifp\n");
! }
! }
! else {
! if(!ifa)
! printf("rtentry - route has empty ifa");
! else if(!ifa->ifa_rtrequest)
! printf("rtentry - route has empty ifa_rtrequest");
! if(ifp = rt->rt_ifp)
! printf(" for %s%d\n",ifp->if_name,ifp->if_unit);
! else
! printf("\nrtentry - route has empty ifp\n");
! }
rttrash++;
if (rt->rt_refcnt <= 0)
rtfree(rt);
Verbose fix - cut here ----------------------------
Mostyn
P.S. A version of this was also posted to freebsd-hackers@freefall.cdrom.com.
NetBSD seems prone to more frequent crashes than FreeBSD - differing memory
usage?