Re: PSA: Clock drift and pkgin

To: Maciej W. Rozycki <macro%orcam.me.uk@localhost>
Subject: Re: PSA: Clock drift and pkgin
From: Johnny Billquist <bqt%softjar.se@localhost>
Date: Mon, 18 Dec 2023 16:09:12 +0100

On 2023-12-18 15:33, Maciej W. Rozycki wrote:

On Fri, 15 Dec 2023, Johnny Billquist wrote:

983136 is pretty close to 1000000. However, without looking at the code,
isn't that the diagnostics timecounter? Not used for anything really
related to time keeping, but just for some other information, like
sampling the state of the cpu and so on?


   It's just a free-running counter, as good as any.  The KA46 has no ICR.


Sort of. ICR and NICR are not required to exist. A machine is allowed to
have a subset implementation of ICCS, only capable of generating an
interrupt every 10 ms with no further control. Which is still the normal
clock used as a source for time in the OS, if I'm not completely confused.


  As long as the ICR is used (or no high-resolution timer is available at
all), using the timer interrupt as the system clock source is the correct
approach.  The thing is the ICR is synchronous to the timer interrupt, and
moreover it is not a free-running counter as it's reinitialised every time
an interrupt is produced.  So using the counter of interrupt ticks as the
high-order bits of the timekeeping timer is the only way you can produce a
monotonic counter.

Yes. But (and) the thing is - the ICCS is *always* available. And is
always used. But when we don't have the ICR, the value from reading out
the clock becomes tricky. Because we are still using ICCS as the source
of clock interrupts that drive the system wall clock. But then we don't
have ICR as a source of information on how much time has passed since the
last clock interrupt. Basically, when we read out time, we call
getticks(), and then add the normalized current value of ICR, as current
time. So if ICR is 0 all the time, we would basically just have a time
that is getticks() with nothing more, with a resolution of 10ms. But it
should for sure be monotonically increasing.
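
To make that readout concrete, here is a minimal sketch in C. It is not
the actual NetBSD code: read_time_us(), the has_icr flag, and the
assumption that ICR counts up at 1 MHz from -TICK_US to 0 between
interrupts are all illustrative. It only shows that without an ICR the
result degrades to 10 ms steps while staying monotonic.

```c
#include <stdint.h>

#define HZ       100                /* clock interrupts per second (10 ms tick) */
#define TICK_US  (1000000 / HZ)     /* microseconds per tick */

/*
 * Hypothetical model of the readout: 'ticks' stands in for what
 * getticks() returns, 'icr' for the raw interval count register
 * (assumed to run from -TICK_US up to 0 between interrupts).
 * Pass has_icr = 0 to model a CPU with only the subset ICCS.
 */
uint64_t
read_time_us(uint32_t ticks, int32_t icr, int has_icr)
{
	uint64_t us = (uint64_t)ticks * TICK_US;

	if (has_icr) {
		/* Normalize to microseconds elapsed since the last tick. */
		int32_t since_tick = icr + TICK_US;

		/* Clamp so a pending interrupt cannot push us into the next tick. */
		if (since_tick < 0)
			since_tick = 0;
		if (since_tick >= TICK_US)
			since_tick = TICK_US - 1;
		us += (uint64_t)since_tick;
	}
	return us;
}
```

With ticks = 5 and icr = -2500 this gives 57500 us when the ICR is used,
and 50000 us when it is not.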

For the KA46 (and *only* the KA46), we are using some other mechanism,
which I haven't really dug into, to get some higher precision time when
reading out time. But let's ignore that platform for the moment. We have
people with various machines, and simulations, which have the time
problem. And most are not KA46. As far as I can see, the KA46 is merely
the 4000/60.

  The drawback is that if you ever lose even a single timer interrupt, then
you lose track of the wall clock too.


Certainly. Which is why the question of lost interrupts was brought up.

But it is definitely the case that this is how time is tracked.
Definitely on VAX. I would think for all other platforms as well, but I
haven't looked at them.

So in the end, what we have is that for most machines, we're getting a higher
resolution clock based on ICR. We basically have the clock tick, which gives
something at 10 ms steps, and then we add in what ICR is at the moment. For
CPUs that don't have an ICR, the clock will just be at the 10 ms resolution
and that's it.
With the exception of the VAX_BTYP_46, which uses another source, that is.


  And we do want to use such another source where no ICR is available as
10ms resolution is pretty horrible for the purpose of timekeeping.  In
that case the other source is not synchronous to the timer interrupt and
therefore the OS ought not use the timer interrupt as the system clock
source.  Instead it should use the approach I outlined previously, that is
use the high-precision timer as the system clock source and only use the
timer interrupt to keep track of timer overflows.
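
The approach outlined in the paragraph above could be sketched roughly as
follows. This is an illustration only, not the pmax or vax code: struct
wallclock, wallclock_update() and the 16-bit counter width are made up
for the example. The point is that the free-running counter is the time
source, and the timer interrupt merely has to call the update at least
once per counter wrap.

```c
#include <stdint.h>

#define COUNTER_BITS 16                         /* assumed counter width */
#define COUNTER_MASK ((1u << COUNTER_BITS) - 1)

struct wallclock {
	uint64_t total;   /* accumulated counter units since start */
	uint32_t last;    /* counter value seen at the previous update */
};

/*
 * Fold the counter units elapsed since the previous call into the
 * running total. The masked subtraction gives the right delta across
 * a single wrap of the counter, so individual missed timer interrupts
 * do not lose time as long as one update happens per wrap period.
 */
uint64_t
wallclock_update(struct wallclock *wc, uint32_t now)
{
	uint32_t delta = (now - wc->last) & COUNTER_MASK;

	wc->total += delta;
	wc->last = now & COUNTER_MASK;
	return wc->total;
}
```

For example, starting from zero, updates at counter values 10000, 40000
and then 4464 (after a wrap) yield totals of 10000, 40000 and 70000,
no matter how many timer interrupts were skipped in between.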

I should point out that even when we don't have the ICR register, we are
running with a 10ms precision clock as far as interrupts are concerned.
And that is where the system clock source comes from.

Basically, we have an interrupt vector at C0 (IPL 16), which points at
hardclock. All in arch/vax/vax/intvec.S.

The hardclock routine in turn calls the C function hardclock(), which is
in kern/kern_clock.c, which is expected to be called HZ times per
second, and which deals with the wall clock, if it's happening on the
primary cpu (hardclock_ticks).

Note that if anyone calls getticks() in the kernel, they will get the
hardclock_ticks value, which is basically just the counter of interrupts
calling hardclock().

  Apart from providing correct time (which will not be the case in this
scenario if you try to treat the counter of timer interrupts as the
high-order bits of the timekeeping timer), the advantage of this approach
is that as long as at least one timer interrupt has been handled between
high-precision timer overflows no track loss of the wall clock will happen
(of course we're not supposed to lose timer interrupts anyway, but the
consequences of missing a preemptive context switch are certainly less
severe in a non-RTOS than getting out of sync with time).

Well. Maybe part of the problem is that VAX is actually using a clock
interrupt for counting time. I wasn't even aware that NetBSD could run
in tickless mode.

There are a lot of things that are usually driven by the ticks. Not just
preemptive context switching.

  The KA46 hardware configuration is analogous to the KN03/3MAX+ machine,
where the source of the timer interrupt is the DS1287 RTC chip and the
high-precision timer is located in the TURBOchannel bridge chip.  This is
handled as "turbochannel_counter" in sys/arch/pmax/pmax/dec_3maxplus.c,
and the KA46 variant ought to work essentially the same.  It is actually
the 3MAX+ machine that David L. Mills used to implement his NTP framework.

The DS1287 should never be a source of any precision time as far as I
know. It has a resolution of 1s. It's usually used as the calendar chip,
from which you set the wall clock on boot, but which you otherwise
usually never bother with.

Yes, it can generate interrupts as well, with a fairly high frequency,
but I can't see a way of reading out any high precision time from it.

But anyway - our timing problems clearly occur on machines with no
DS1287, and with ICR, as well as all other combinations. And even the
4000/60 is using getticks() sourced from the ICCS register as the
starting point, and then it just uses some other information to get some
more precision, since the ICR doesn't exist. (We should probably look at
extending that to more machines, because if the 4000/60 doesn't have
this, then it's likely that the same is true for all 4000 machines...)

  NB I disagree that 983136Hz is pretty close to 1000000Hz.  The frequency
difference implies a ~1 second drift per 1 minute, which I find pretty
horrible by any measure.

:-)

I said that a little tongue in cheek. But also, I'm not entirely sure
how the value is used. I can see some computations on the KA46 to work
out a high precision time, which are not simple copies of values. So
whether there is some scaling going on that is included one way or
another in some values here, I'm not sure. But as I said, I'm not even
going to sort this one out right now. Keeping it simple, and starting
with machines that don't even deal with that hardware.

We still have something seriously wrong on hardware that, it seems,
should not have this problem (but I really should check that simh isn't
doing the ICR wrong).
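
For reference, the arithmetic behind the "~1 second per minute" figure
can be checked with a few lines of C. This assumes the counter really
runs at 983136 Hz while being treated as exactly 1000000 Hz, which is
precisely the part I have not verified; drift_us_per_minute() is just a
helper name made up for the example.

```c
#include <stdint.h>

/*
 * Microseconds lost per real minute when a counter that actually runs
 * at actual_hz is interpreted as if it ran at nominal_hz. In one real
 * minute the counter advances actual_hz * 60 units, which are read as
 * actual_hz * 60 / nominal_hz seconds; the shortfall is the drift.
 */
int64_t
drift_us_per_minute(int64_t nominal_hz, int64_t actual_hz)
{
	return (nominal_hz - actual_hz) * 60 * 1000000 / nominal_hz;
}
```

With nominal_hz = 1000000 and actual_hz = 983136 this gives 1011840 us,
so a bit over one second lost per minute under that assumption.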


  Johnny

--
Johnny Billquist                  || "I'm on a bus
                                  ||  on a psychedelic trip
email: bqt%softjar.se@localhost             ||  Reading murder books
pdp is alive!                     ||  tryin' to stay hip" - B. Idol
