Port-vax archive


Re: PSA: Clock drift and pkgin



I agree with all your observations and comments, Maciej. It also reminded me that I have indeed observed quite a bit of jitter myself in the past.

So it does not seem to be a simple case of lost interrupts, because lost interrupts can only make the clock run slow, never fast.

One hour over 100 days is a deviation of about 0.04% (1/2400, or roughly 420 ppm). I think that is possible, but it wouldn't explain the jitter.

As for the high frequency clock in the VAX, I'm not sure it is used at all by NetBSD. But it's been quite a while since I was digging around, and if someone knows it is being used, I'd be happy to hear. But I think we have several problems that should be sorted. And for me, the first one would be to understand how on earth we can be spending so much time in system mode when we do anything.

Since I'm mostly using 2.11BSD as a reference these days (as I poke around much more in there), it's the system I commonly use for comparison. I know it's not a fair comparison in most ways, but when I try to load it down with a whole bunch of work, I can almost never get system time to go over 20%, while on a pretty much current NetBSD/vax I seldom (if ever) see it go below 50%. And the VAX is a much faster machine here, so I can't understand how it can spend the majority of its time in system mode when running something like a compile.

  Johnny


On 2023-12-14 00:06, Maciej W. Rozycki wrote:
On Wed, 13 Dec 2023, Johnny Billquist wrote:

When running 1.4T on my VAX emulator, I've noticed clock drift.  One of
my "I want to do someday" things is to figure out what's going on with
that.  (And that's 1.4T, too, not "these days".)

Do you know if that was drift affected by load, or just simple drift?
Because if you run a VAX without something like ntp, you will *always* have
some drift. The clock interrupt on the VAX isn't very precise. It's running at
100 Hz, but the error is commonly on the order of a tenth of a percent, meaning
over a day you can easily have a drift of a minute or two.

  Ntpd is able to compensate for systematic clock drift, but not for random
drift, which makes ntpd eventually quit.  With lizzie, which is a KA46, I have
always observed ntpd giving up and then the clock drifting away.  This is
with:

lizzie$ uname -a
NetBSD lizzie 9.0 NetBSD 9.0 (GENERIC) #0: Fri Feb 14 00:06:28 UTC 2020  mkrepro%mkrepro.NetBSD.org@localhost:/usr/src/sys/arch/vax/compile/GENERIC vax
lizzie$

and I now have:

lizzie$ uptime
10:53PM  up 107 days, 10:17, 2 users, load averages: 0.02, 0.12, 0.07
lizzie$ date
Wed Dec 13 22:53:45 UTC 2023
lizzie$

while the correct time is:

angie$ date
Wed 13 Dec 21:48:49 UTC 2023
angie$

so the clock has been running fast, not the usual sign of lost interrupts.
Now lizzie has been sitting mostly idle since I last ran GCC regression
testing in mid Oct, except for occasional malicious network connection
attempts which do take some processing power of the venerable machine.
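  As a rough cross-check of the figures above, here is a minimal sketch that
turns the uptime and the two date outputs into a drift rate.  It assumes the
two date commands were run at effectively the same moment and that lizzie's
clock was correct at boot; neither is stated above, and the function name is
made up for illustration.

```python
from datetime import datetime, timedelta

# Values copied from the uptime/date output quoted above.
uptime = timedelta(days=107, hours=10, minutes=17)
lizzie = datetime(2023, 12, 13, 22, 53, 45)
angie = datetime(2023, 12, 13, 21, 48, 49)


def drift_summary(local, reference, elapsed):
    """Return (offset in seconds, rate in ppm, drift in seconds per day).

    Assumes the clock was correct `elapsed` ago and both readings were
    taken at the same instant (both are assumptions, not measurements).
    """
    offset = (local - reference).total_seconds()
    rate = offset / elapsed.total_seconds()
    return offset, rate * 1e6, rate * 86400


offset_s, ppm, s_per_day = drift_summary(lizzie, angie, uptime)
print(f"offset {offset_s:.0f} s, {ppm:.0f} ppm, {s_per_day:.1f} s/day")
```

  Under those assumptions the clock is ahead by 3896 s, which works out to
roughly 420 ppm, or a bit over half a minute per day.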

  I can resynchronise the clock and rerun ntpd and see if the daemon
survives while the machine is idle, say until tomorrow.

  What is notable, however, is the particularly high clock jitter reported
with lizzie, unlike with my various other systems, including
similarly old and slow ones such as a KN03 DECstation MIPS machine, a
486DX2 PC machine or a dual P5-MMX PC machine.

  So for lizzie the jitter jumps between ~50 and ~150 while with the P5-MMX
it is below 10, with the 486DX2 it is below 0.5 and with the KN03 it is
below 0.1 even.

  High jitter may legitimately happen with systems that rely solely on the
timer interrupt for timekeeping and consequently have a very coarse time
resolution, because the accuracy of the time returned by system facilities
such as gettimeofday(2) or ntp_gettime(2) will then depend on how much
time has passed since the last timer interrupt tick when the call is made.
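  To illustrate the effect, here is a small simulation of a clock that only
advances on a 100 Hz timer interrupt and is read at random moments.  This is
a sketch of the general mechanism only, not of what NetBSD/vax actually does;
the names and the sample count are arbitrary.

```python
import math
import random

HZ = 100            # timer interrupt rate mentioned in this thread
TICK = 1.0 / HZ     # 10 ms between ticks


def tick_clock(true_time, hz=HZ):
    """Time reported by a clock that is only updated at each tick."""
    return math.floor(true_time * hz) / hz


def read_errors(n=10000, seed=1):
    """Error (true time minus reported time) for n reads at random moments."""
    rng = random.Random(seed)
    errors = []
    for _ in range(n):
        t = rng.uniform(0.0, 1000.0)
        errors.append(t - tick_clock(t))
    return errors


errors = read_errors()
mean_error = sum(errors) / len(errors)
print(f"max error {max(errors) * 1e3:.2f} ms, mean {mean_error * 1e3:.2f} ms")
```

  Each read is late by anything from zero up to one full tick, so the error
is spread over a 10 ms window and averages about half a tick.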

  As I recall the KA46 does have a high-precision timer though, so it seems
like there is something fishy going on here: either the calculation of the
fractional part of timer interrupt ticks isn't right or the latency of the
clock retrieval system facilities is highly variable for some reason.
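  For what it is worth, one generic way the fractional-part calculation can
go wrong is sketched below: the sub-tick counter has already wrapped, but the
pending timer interrupt has not been counted yet.  This is purely
hypothetical; the counter, the pending flag and the function are invented for
illustration and are not taken from the NetBSD/vax sources.

```python
TICK_US = 10_000  # one tick at 100 Hz, in microseconds


def read_time_us(ticks, counter_us, tick_pending=False, handle_pending=True):
    """Combine whole ticks with a hypothetical sub-tick counter.

    counter_us is the time since the counter last wrapped.  If it has
    wrapped but the interrupt is still pending, the tick count is one
    short unless the reader compensates for it.
    """
    if tick_pending and handle_pending:
        ticks += 1
    return ticks * TICK_US + counter_us


# Read just before the counter wraps.
before = read_time_us(500, 9_990)
# Read just after the wrap, with the interrupt not yet serviced.
after_ok = read_time_us(500, 5, tick_pending=True)
after_bad = read_time_us(500, 5, tick_pending=True, handle_pending=False)

print(after_ok - before, after_bad - before)
```

  With the pending tick accounted for, time advances by 15 us between the
two reads; without it, time appears to step back by almost a full tick, which
would show up as large jitter.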

   Maciej

--
Johnny Billquist                  || "I'm on a bus
                                  ||  on a psychedelic trip
email: bqt%softjar.se@localhost             ||  Reading murder books
pdp is alive!                     ||  tryin' to stay hip" - B. Idol

