current-users: Re: Wow. (Was benchmarks...)

Subject: Re: Wow. (Was benchmarks...)
To: George Michaelson <G.Michaelson@cc.uq.oz.au>
From: Dennis Ferguson <dennis@mci.net>
List: current-users
Date: 12/06/1994 09:02:43

George,

>   You've never justified *why* it's a good idea to patch it at runtime.
>   Is there some reason you can't change it in param.c?  If the default
>   is fundamentally wrong, then perhaps it should be changed globally.
>
> My understanding is that NTP V3 believes it can analyze the clockdrift
> across very long baselines, and derive an "optimal" value for this which
> can then be runtime stuffed back in at reboot or on demand. 

This isn't quite right.  tickadj does two things in a BSD kernel that
ntp cares about:

(1) It sets the slew rate at which adjustments are done (this rate is
    actually tickadj/tick).  ntp needs to know this rate, and hence
    determine tick and tickadj, so it can make sure it doesn't try to
    do an adjustment which is so large that it won't be completed by the
    time ntp wants to make its next adjustment.

(2) While the BSD adjtime() allows you to specify an adjustment with
    microsecond precision, it then gratuitously and obnoxiously truncates
    the adjustment to a whole multiple of tickadj microseconds.  Left
    uncorrected, the accumulated error from this can make ntp seriously
    dysfunctional.  ntp hence wants to know the value of tickadj so it
    can arrange to never make an adjustment which isn't a whole multiple
    of tickadj microseconds (it does this by accumulating undone adjustments
    until it has enough to give it a bigger bump).
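
The accumulation described in (2) can be sketched roughly as follows.  This
is only an illustration, not the code ntp actually uses: the struct and
function names are made up for the example, and tickadj is assumed to have
already been read from the kernel by some other means.

```c
/* Hypothetical accumulator; the names here are illustrative only. */
struct adj_accum {
    long tickadj_us;   /* kernel's tickadj in microseconds (assumed known) */
    long residual_us;  /* correction not yet handed to adjtime() */
};

/*
 * Add a wanted correction to the pending residual and return the part
 * that is a whole multiple of tickadj, i.e. the amount that adjtime()
 * would apply without truncating anything.  The remainder is carried
 * over to the next call.  C99 integer division truncates toward zero,
 * so negative corrections are handled symmetrically.
 */
long adj_accum_step(struct adj_accum *a, long wanted_us)
{
    long total = a->residual_us + wanted_us;
    long issue = (total / a->tickadj_us) * a->tickadj_us;

    a->residual_us = total - issue;
    return issue;
}
```

With tickadj at 40 us, two successive 25 us corrections would produce no
adjustment on the first call and a 40 us adjustment on the second, with
10 us still held back.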

Thus ntp doesn't actually care all that much about the particular value
of tickadj as long as it can determine what it is.  It does care some:
since tickadj provides sort of a best-case accuracy for the clock, it is
better if the value is smaller than the best precision you might expect
from ntp (and small enough that the probability of seeing backward-moving
time from successive gettimeofday() calls is low), but not so small that
relatively large slews won't complete in ntp's adjustment interval.  If the
default value is between sort of 5 us and 50 us, for a 10,000 us tick, it'll
probably do about as well as any other value.  The tweaker program exists
only for the benefit of people whose software vendors set tickadj to really
silly values by default (SunOS used to default it to 1000).  Note that the
tweaker program will also let you set a new value of tick, a facility which
exists solely for people whose hardware vendors sold them machines with
very broken clocks.
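
To put numbers on the trade-off, here is a small sketch of the arithmetic.
It assumes the kernel slews by tickadj microseconds on every clock tick
while an adjustment is pending (which is where the tickadj/tick rate comes
from); the function names and the one-second interval used in the note
afterwards are chosen just for the example.

```c
/* Slew rate in parts per million: tickadj/tick scaled by 1e6. */
long slew_rate_ppm(long tick_us, long tickadj_us)
{
    return (long)(((long long)tickadj_us * 1000000LL) / tick_us);
}

/*
 * Largest adjustment, in microseconds, that can complete within
 * interval_us if each tick slews the clock by tickadj_us.
 */
long max_slew_us(long tick_us, long tickadj_us, long interval_us)
{
    long ticks = interval_us / tick_us;   /* whole ticks in the interval */

    return ticks * tickadj_us;
}
```

For a 10,000 us tick, a tickadj of 5 us works out to 500 ppm, or at most
500 us of correction per second; 50 us gives 5000 ppm; and a tickadj of
1000 is a 100,000 ppm (10%) slew.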

Of course, the fact that you need to grovel around looking for the values
of kernel variables, and be so intimately aware of the implementation details,
to make effective use of adjtime() in the first place should suggest that
this is a system call which, while well conceived, is really badly designed
and implemented.  When I cared more about this I added a replacement to an
IBM RT kernel I was intimate with which

(a) let you explicitly set the slew rate for each adjustment

(b) did the full adjustment without truncating bits which were specified

(c) let you specify the magnitude of the adjustment separately from the
    direction (forward or back) so you didn't have to read kernel code
    to figure out what a negative-valued struct timeval should look like

and the whole sad issue went away.  tickadj is not an architectural
feature, it is a bug.
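
For illustration only, an interface with properties (a) through (c) might
look something like the sketch below.  This is not the RT kernel code, which
isn't reproduced here; every name is hypothetical, and expressing the rate
in microseconds per tick is just one possible choice.

```c
/* Hypothetical request format; all names are invented for this sketch. */
enum slew_dir { SLEW_FORWARD, SLEW_BACKWARD };

struct slew_request {
    unsigned long magnitude_us;      /* (c) size of the adjustment, never negative */
    enum slew_dir direction;         /* (c) direction given separately */
    unsigned long rate_us_per_tick;  /* (a) slew rate chosen per adjustment */
};

struct slew_plan {
    unsigned long full_ticks;    /* ticks that slew by the full rate */
    unsigned long last_step_us;  /* remainder applied on one final tick */
};

/*
 * Split a request into full-rate ticks plus one final partial step, so
 * that (b) the whole magnitude is applied and nothing is truncated.
 * Returns 0 on success, -1 if the requested rate is zero.
 */
int slew_plan_make(const struct slew_request *req, struct slew_plan *plan)
{
    if (req->rate_us_per_tick == 0)
        return -1;

    plan->full_ticks = req->magnitude_us / req->rate_us_per_tick;
    plan->last_step_us = req->magnitude_us % req->rate_us_per_tick;
    return 0;
}
```

A 1234 us backward adjustment at 40 us per tick would then take 30 full
ticks plus a final 34 us step, and the caller never needs to know tick or
tickadj.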

Dennis Ferguson