Re: Routing socket issue?

To: Roy Marples <roy%marples.name@localhost>, Paul Goyette <paul%whooppee.com@localhost>
Subject: Re: Routing socket issue?
From: Frank Kardel <kardel%netbsd.org@localhost>
Date: Sun, 31 Jan 2021 08:58:48 +0100

Hi Roy!


On 01/31/21 03:27, Roy Marples wrote:

On 30/01/2021 22:01, Frank Kardel wrote:
"why it needs to be interested i..."
Ntpd needs to know the local address being used when sending to peers (authentication, which socket to use). That is why it not only reacts to
address information but also redetermines which local addresses (and sockets) are being used for reaching its peers.
The interaction with the routing socket is purposely simple. ntpd just needs to know that *something* has changed. It will then, after a grace period, rescan the interfaces and reevaluate
the interface/local address/socket setup. It does not need to be extremely snappy but it needs to happen.
Dropping that might delay ntpd's detection of changed local addresses for peers.
For example, I fail to see how RTM_LOSING helps that, because it won't change
how ntpd would configure itself.

Well, if I read the comment right, I am inclined to differ here.
In in_pcb.c we find:
/*
 * Check for alternatives when higher level complains
 * about service problems.  For now, invalidate cached
 * routing information.  If the route was created dynamically
 * (by a redirect), time to try a default gateway again.
 */
in_losing(struct inpcb *inp)

and the call is in tcp_timer.c:
    /*
     * If losing, let the lower level know and try for
     * a better route.  Also, if we backed off this far,
     * our srtt estimate is probably bogus.  Clobber it
     * so we'll take the next rtt measurement as our srtt;
     * move the current srtt into rttvar to keep the current
     * retransmit times until then.
     */

As ntpd acts after a grace period, the routing engine may have corrected this situation and routing may indeed change. ntpd's interactions with peers can take up to 1024s, so it is good to attempt, in a best-effort way, to keep the internal local address/socket state close to the current state.

It is likely, though, that there have been routing messages like RTM_CHANGE/ADD/DELETE before that, and RTM_LOSING is not providing additional information at that point.

As NTP doesn't bring interfaces up or down, RTM_IFANNOUNCE is useless as well. If the interface does vanish, any addresses on it will be reported via RTM_DELADDR. RTM_IFINFO is also questionable, as the commentary in the code is that it only cares about addresses.

Well, I read this comment in ntp_io.c
                        /*
                         * we are keen on new and deleted addresses and
                         * if an interface goes up and down or routing
                         * changes
                         */
not as being interested in addresses only.

Also keep in mind that at this point routing messages are processed in a loop and the action here

    timer_interfacetimeout(current_time + UPDATE_GRACE);

just sets the variable for the next interface+local address update run. This is very cheap. The grace period will batch multiple routing messages together. An explicit routing message flush is, from my point of view, code clutter here, as the socket is effectively drained in the loop at the cost of examining the msg_type and setting a variable. Not much gained here.
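
To illustrate the batching described above, here is a minimal sketch in C. It is not the ntpd code: the message header layout, the `MSG_*` type values, the `UPDATE_GRACE` value of 2 seconds, and the helper names `process_messages()` and `rescan_due()` are all hypothetical stand-ins. It only shows that draining a buffer of messages costs one type check and one assignment per message, and that any number of messages inside the grace period lead to a single rescan.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical message types, standing in for the platform's RTM_* values. */
enum msg_type {
    MSG_NEWADDR = 1, MSG_DELADDR, MSG_ADD, MSG_DELETE,
    MSG_CHANGE, MSG_LOSING, MSG_IFINFO, MSG_OTHER
};

/* Hypothetical minimal header: every message starts with length and type. */
struct msg_hdr {
    unsigned short len;   /* total length of this message in bytes */
    unsigned char  type;  /* one of enum msg_type */
};

#define UPDATE_GRACE 2UL  /* seconds; the value is an assumption */

/* Time of the next interface rescan; 0 means no rescan is pending. */
static unsigned long interface_timer;

/* Only known event types trigger a rescan; everything else is skipped. */
static int is_interesting(unsigned char type)
{
    switch (type) {
    case MSG_NEWADDR: case MSG_DELADDR: case MSG_ADD: case MSG_DELETE:
    case MSG_CHANGE:  case MSG_LOSING:  case MSG_IFINFO:
        return 1;
    default:
        return 0;
    }
}

/*
 * Walk all messages in buf. Each interesting message just (re)arms the
 * timer, so a burst of messages is batched into one later rescan.
 * Returns the number of interesting messages seen.
 */
static size_t process_messages(const unsigned char *buf, size_t n,
                               unsigned long now)
{
    size_t off = 0, hits = 0;

    assert(buf != NULL);
    while (n - off >= sizeof(struct msg_hdr)) {
        struct msg_hdr h;

        memcpy(&h, buf + off, sizeof h);
        if (h.len < sizeof h || h.len > n - off)
            break;                      /* malformed or truncated message */
        if (is_interesting(h.type)) {
            interface_timer = now + UPDATE_GRACE;
            hits++;
        }
        off += h.len;
    }
    return hits;
}

/* Returns 1 once when the grace period has expired, then clears the timer. */
static int rescan_due(unsigned long now)
{
    if (interface_timer != 0 && now >= interface_timer) {
        interface_timer = 0;
        return 1;
    }
    return 0;
}
```

A caller would feed each buffer read from the routing socket to `process_messages()` and poll `rescan_due()` from its timer loop; the actual interface rescan happens only when the latter returns 1.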

NOTE TO SELF: our kernel doesn't seem to report RTM_CHGADDR anymore, looking at nxr.netbsd.org
I mean, if you want to argue against any of that then I would suggest: why even bother filtering or looking at overflow at all? Shrink the code - any activity on the routing socket, drain it ignoring all errors, start the interface update timer.

That would be an option, but we should react only to known events. There may be one or two events that could be removed from the list after examination, as other messages can cover for them. Keep in mind that this is a portable code section and the code tries to be on the fail-safe, robust side for the goal of address/routing tracking, so adjusting it to a particular implementation may break on other OS implementations.

As for the message: IMHO it does not need to be logged at all (DPRINTF/maybe LOGDEBUG at most) because the overflow should and does just trigger ntpd to reevaluate the interface/routing configuration.
This information is not important at all for normal operation as the effects are correctly mitigated.
Great.
BTW: does the current code revert to (fail safe) periodic interface scanning if the routing socket is being disabled (happens when an unexpected error code is returned from read(2))?
No.
The socket is non-blocking, so the only error to ignore here would be EINTR.
Any other errors are due to bad programming IMO.

Could be bad programming, but I prefer ntpd being forgiving of hiccups by reverting to periodic scanning when we disable the routing socket. That is a fail-safe strategy and would also warrant a log message as it is an unusual event.
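
As a rough sketch of that fail-safe strategy (not the current ntpd behaviour, which per the answer above does not fall back): the struct `monitor_state`, the function names, and the 300 second `FALLBACK_SCAN_INTERVAL` below are hypothetical. Treating ENOBUFS as the overflow indication is also an assumption about the platform. The idea is that EINTR is retried, an overflow only schedules a rescan, and any unexpected error disables the routing socket and switches to periodic scanning.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define FALLBACK_SCAN_INTERVAL 300UL  /* seconds; hypothetical value */

enum read_result { READ_RETRY, READ_DRAINED, READ_OVERFLOW, READ_FATAL };

/* Hypothetical bookkeeping for the interface/routing tracking. */
struct monitor_state {
    int           routing_socket_enabled;
    unsigned long periodic_interval;  /* 0: no periodic scanning */
    unsigned long rescan_at;          /* 0: no rescan pending */
};

/* Map an errno value from read(2) on a non-blocking socket to an action. */
static enum read_result classify_read_error(int err)
{
    if (err == EINTR)
        return READ_RETRY;            /* interrupted, just try again */
    if (err == EAGAIN || err == EWOULDBLOCK)
        return READ_DRAINED;          /* nothing left to read */
    if (err == ENOBUFS)
        return READ_OVERFLOW;         /* assumed overflow indication */
    return READ_FATAL;                /* anything else is unexpected */
}

/*
 * Apply the action. An overflow means messages were lost, so a rescan is
 * scheduled. A fatal error disables the socket, schedules a rescan, and
 * enables periodic scanning; the caller should log this unusual event.
 */
static enum read_result handle_read_error(struct monitor_state *st, int err,
                                          unsigned long now,
                                          unsigned long grace)
{
    enum read_result r;

    assert(st != NULL);
    r = classify_read_error(err);
    if (r == READ_OVERFLOW) {
        st->rescan_at = now + grace;
    } else if (r == READ_FATAL) {
        st->routing_socket_enabled = 0;
        st->periodic_interval = FALLBACK_SCAN_INTERVAL;
        st->rescan_at = now + grace;
    }
    return r;
}
```

With this shape the overflow case needs no log message at all, while the fatal case is the one place where a log entry is warranted.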

Roy

Frank
