Subject: Useful info for the Portmasters, was: ntpd - does it work?
To: William O Ferry <woferry@iname.com>
From: Henry B. Hotz <hotz@jpl.nasa.gov>
List: port-macppc
Date: 03/27/2000 10:40:22
At this point you may know more about ntp than I do. I hope you don't mind my
posting this to the list. I think the developers need this information.
As a general observation the ntp folks see more problems with incorrect
configuration of the ntp parameters than with incorrect kernel knowledge of what
the hardware was doing. I think the tickadj program was primarily a hack for
SunOS4 to fix some similar kernel ignorance, and I suspect that NetBSD thinks
it's better to fix the port than support a hack. OTOH there are machines where
the hardware may actually be inconsistent and we may *need* tickadj for those
ports. Does the program compile/work if you get it from UDel?
At 1:56 PM -0800 3/26/00, William O Ferry wrote:
>Henry,
> I've had some time to play around with things the past few days, and got
>something that seems to work, though I'm not sure it's much less of a hack
>than my previous hack.. =)
>
> First off, macppc does indeed hook into NTP, it's all done in the call to
>hardclock(). The way it hooks in it doesn't affect the interrupt rate as my
>hack did, though it does still affect the time. (I ensured that the interrupt
>came every 1/hz, hardclock() adjusts its view of hz to compensate for the
>actual interrupt rate. I think I still like the former better.. =)
I think the ntp design assumption is that it can't change the hardware interrupts,
just how they get interpreted by users. The Unix version of ntp was originally an
outside-the-kernel process that ran on commercial implementations. I won't say it's
right, just that it is.
> I also re-verified that xntpd doesn't sync either with or without "options
>NTP" set.
>
> Reading through the various NTP .html documents, I've learned that xntpd
>really only expects there to be a small amount of error. Their docs say that
>the error should be +/- 100ppm. Looking at sys/timex.h, while the comment
>says 100ppm it actually sets it at 512ppm. The ppm rate on my PowerBook is
>about +2638ppm. Needless to say, xntpd is actually *unable* to set the
>correct frequency offset because of the 512ppm limit. So even though I could
>toss a value like 2000 in /etc/ntp.drift, xntpd could not set the value any
>larger than 512, so the clock continued to drift at a slightly slower rate.
>It turned out (no real surprise) that while my machine kept bumping the clock
>it never considered itself in sync with the server, so it remained at stratum
>16, and often called the server "insane" in its listings.. =) When I tried
>pointing it to the 6-or-so Apple NTP servers I could see it exchanging packets
>with all of them, but it always considered them "unreachable" for some reason
>and never adjusted the clock at all, even after ntpdate's to set the clock
>back to a reasonable value.
>
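To put rough numbers on the limit described above, here is a minimal sketch of the arithmetic. It assumes only the figures quoted in the message (a 512 ppm cap and a measured +2638 ppm error). The function names are hypothetical and are not taken from the xntpd or kernel sources.

```python
# Illustrative arithmetic only; names are made up for this sketch.
MAX_FREQ_PPM = 512.0  # limit quoted in the message as set in sys/timex.h


def clamp_ppm(requested_ppm, limit_ppm=MAX_FREQ_PPM):
    """Limit a requested frequency correction to +/- limit_ppm."""
    return max(-limit_ppm, min(limit_ppm, requested_ppm))


def residual_ppm(actual_error_ppm, limit_ppm=MAX_FREQ_PPM):
    """Frequency error left over after the largest allowed correction."""
    return actual_error_ppm - clamp_ppm(actual_error_ppm, limit_ppm)


def seconds_per_day(ppm):
    """Convert a frequency error in ppm to drift in seconds per day."""
    return ppm * 1e-6 * 86400.0


measured = 2638.0  # ppm, the PowerBook figure from the message
left = residual_ppm(measured)
print(f"applied {clamp_ppm(measured):.0f} ppm, residual {left:.0f} ppm, "
      f"about {seconds_per_day(left):.1f} s/day")
```

With a cap that small against an error that large, most of the drift stays uncorrected, which is consistent with the clock continuing to drift at a slower rate. Raising the limit to 5120, as tried later in the message, brings the residual in this model to zero.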
> The documentation suggests that the way to adjust a clock that far off
>is to use the tickadj program. NetBSD however does not provide this program (at
>least I can't find it on any of my systems or in the source tree). I'm not
Some of the memory fog is clearing now. Old SPARCstations had especially bad
clock crystals and you had to calibrate them to get close enough for things to
work. That was the reason for the process I gave you, but I left out the tickadj
step. Basically it was the same problem you are having. Also a model or two
had different crystal rates by maybe 10% or so and different patchlevels of the
OS "fixed the problem" in different ways.
>sure I'd know what to set these values to even if tickadj were available. The
>other 'nit' is that really the clock doesn't vary, in fact I've found mine to
>be rock-solid with a frequency counter. It's just not what Open Firmware
>claims it is (and 66.8MHz is what most 66MHz PCI-based buses actually
>run at, not 66.666666MHz but a 14.31818MHz +/- 100ppm source clock in a 14/3
>ratio). So the issue really isn't that the clock is that far off, it's that
>NetBSD/macppc used the wrong value for the clock. I know of no way to get a
>more correct value from the system, however.
>
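The 14/3 ratio mentioned above can be checked with a few lines of arithmetic. This is a sketch using only the nominal figures from the message (14.31818 MHz source, 66.666666 MHz assumed bus rate). The helper name is hypothetical, and the result is not meant to reproduce the measured 2638 ppm, which depends on what Open Firmware actually reports on a given machine.

```python
# Illustrative arithmetic only; ppm_error is a made-up helper for this sketch.
def ppm_error(assumed_hz, actual_hz):
    """Error in ppm when software assumes assumed_hz but hardware runs at actual_hz.

    Positive means the hardware ticks faster than assumed, so the clock gains time.
    """
    return (actual_hz - assumed_hz) / assumed_hz * 1e6


source_hz = 14_318_180        # 14.31818 MHz source clock from the message
bus_hz = source_hz * 14 / 3   # 14/3 ratio from the message, roughly 66.8 MHz
nominal_hz = 66_666_666       # the "66.666666MHz" figure from the message

print(f"bus clock: {bus_hz / 1e6:.4f} MHz")
print(f"error vs nominal: {ppm_error(nominal_hz, bus_hz):.0f} ppm")
```

Even with a perfectly stable crystal, a mismatch of this size between the assumed and actual rate is several times larger than the 512 ppm correction range.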
> Anywho, what I did was to change the ppm limit in sys/timex.h from 512 to
>5120. After this xntpd still wouldn't settle on the correct value even after
>a day or so. But once I threw a 3000 in /etc/ntp.drift, it settled within a
>few hours on 2638 and seems to be running fine now.
This sounds like a good experimental verification of why they have the 512 limit.
The problem as you say is that the starting point is off.
> So it *will* sync, provided I allow it a lot more slop than it usually has.
>And it definitely needs the kick in the right direction, it wouldn't come up
>with the right value on its own. The big question is what bad effects does
>bumping up the error rate so much cause??? =) Presumably they keep it low
>for a reason?
>
> Any thoughts? Thanks a bunch for the help so far.
It seems to me that we need some model-dependent code that will just set the rates
correctly for the models where OF gives us the wrong value---which it does for both
of your machines |-(. You obviously know what the rate is better than OF and if
you just put it in then you showed that everything else works.
If you don't get a response from the list then do a send-pr.
Signature failed Preliminary Design Review.
Feasibility of a new signature is currently being evaluated. h.b.hotz@jpl.nasa.gov, or hbhotz@oxy.edu