tech-net archive


Re: dropped routing socket messages...




On 16 May 2008, at 11:16 , J.T. Conklin wrote:

I have a system that uses a user-space routing table implementation.
It uses the kernel routing table as a forwarding table, and installs
and removes routes as needed by the higher level routing policy.  It
also opens a routing socket and monitors it for addresses being added
and deleted from interfaces, interface link state, etc.

Occasionally (less than 1% of system restarts), it seems that routing
table events are being lost which results in my user-space routing
table getting out of sync with reality.

Under what circumstances would this happen? My initial hypotheses were
that either the kernel couldn't allocate an mbuf for the event, or the
routing socket receive buffer wasn't large enough and the event was
dropped then.  But I don't see any "requests for memory denied" in
the mbuf stats (netstat -m); and I've set the socket buffer size to
128K, and the total number of events is way lower than that.

Are there any other likely reasons why routing socket events would
be dropped?  FWIW, this is a NetBSD-4 kernel.

There seems to be another queue between hard interrupt and soft interrupt
level where stuff can also be dropped.  See route_enqueue().

I'd note, however, that the routing socket's best effort design makes
it inherently unreliable, or at least unscalable, for tracking state
like that.  I've worked on very large systems which attempted to use
the routing socket for that, and the solution for getting a reliable
outcome always ended up being redesigning all the data structures which
were associated with routing socket messages so that state changes could
be tracked in the structures themselves and messages were only formatted
and delivered when the application asked for them.  This is a lot of work.
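
To make the idea of tracking state changes in the structures themselves
concrete, here is a minimal sketch in portable C.  It is not NetBSD code
and not the implementation from the systems mentioned above; every name
in it (route_entry, route_table, listener, route_change, listener_poll)
is hypothetical, and a fixed-size array stands in for a real routing
table.  Each entry records the generation of its last change, and a
listener remembers the highest generation it has seen, so there is no
per-event queue that can overflow.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_ROUTES 8

/* Hypothetical route entry: change state is kept in the entry itself. */
struct route_entry {
    char dest[32];      /* destination prefix, kept as text for brevity */
    int in_use;         /* slot holds a live route or a tombstone */
    int deleted;        /* tombstone: the route has been removed */
    unsigned long gen;  /* generation of the last change to this entry */
};

struct route_table {
    struct route_entry e[MAX_ROUTES];
    unsigned long gen;  /* bumped once per change */
};

/* Per-listener state is a single counter, not a queue of messages. */
struct listener {
    unsigned long seen; /* highest generation already reported */
};

/* Record an add (deleted == 0) or a delete (deleted == 1). */
int route_change(struct route_table *t, const char *dest, int deleted)
{
    struct route_entry *slot = NULL;

    assert(t != NULL && dest != NULL);
    for (int i = 0; i < MAX_ROUTES; i++) {
        if (t->e[i].in_use && strcmp(t->e[i].dest, dest) == 0) {
            slot = &t->e[i];        /* existing entry or tombstone */
            break;
        }
        if (!t->e[i].in_use && slot == NULL)
            slot = &t->e[i];        /* remember the first free slot */
    }
    if (slot == NULL)
        return -1;                  /* table full */

    snprintf(slot->dest, sizeof slot->dest, "%s", dest);
    slot->in_use = 1;
    slot->deleted = deleted;
    slot->gen = ++t->gen;
    return 0;
}

/*
 * Format messages only when the listener asks for them.  Writes one
 * line per entry changed since the last poll and returns the count.
 */
int listener_poll(const struct route_table *t, struct listener *l, FILE *out)
{
    int n = 0;

    for (int i = 0; i < MAX_ROUTES; i++) {
        const struct route_entry *r = &t->e[i];

        if (r->in_use && r->gen > l->seen) {
            fprintf(out, "%s %s\n", r->deleted ? "DEL" : "ADD", r->dest);
            n++;
        }
    }
    l->seen = t->gen;
    return n;
}
```

With a zeroed route_table and listener, calling route_change() to add
and then delete the same prefix before the next listener_poll() yields
a single DEL line for it.  Intermediate events are coalesced, but the
listener always ends up with the current state of every entry that
changed.  The sketch never reclaims tombstones; a real design would
have to free them once every listener has seen them.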

Dennis Ferguson


