Re: DECstation 5000/200 timekeeping

To: Jonathan Stone <kiwi_jonathan%yahoo.com@localhost>
Subject: Re: DECstation 5000/200 timekeeping
From: "Maciej W. Rozycki" <macro%orcam.me.uk@localhost>
Date: Mon, 1 Nov 2021 01:49:07 +0000 (GMT)

On Mon, 1 Nov 2021, Jonathan Stone wrote:

> > Hmm, as it happens we have a timekeeping problem with the VAX port too,
> > cf. PR port-vax/56383, so I wonder if this is something generic, perhaps
> > specific to slower systems which can miss interrupts more easily if we're
> > not careful enough in making them delivered in a timely manner.
> 
> That doesn't sound plausible. I used to actively develop NTP with Dave 
> Mills and Harlan Stenn, using a DECstation 5000/150.
> I added cycle-counter support for that, but NTP was quite usable even 
> without. And I regularly used NetBSD on DECstations for networking 
> throughput tests. Also, that 5000/150 has never seen any "sd" SCSI disks 
> since Ralph Campbell's "rz" driver was replaced with the MI "sd" driver. 
> (fails to find a root filesystem, "?" lists nothing)

 Umm, I didn't imply it had something to do with NTP timekeeping code; 
rather something else in the kernel may be preventing NTP code from doing 
its stuff by starving timer interrupts.

> So I conclude there's some pmax-specific breakage. There may well be some 
> MI breakage, but the pmax behavior is clearly a regression.
> (Mouse's report that pre-1.5 NetBSD has _much_ better timekeeping (their 
> emphasis) seems like good evidence.)
> 
> I'm not saying there's a single root-cause for both the unacceptable 
> timekeeping, and the SCSI-disk breakage. It's more that I can't test or 
> repro Mouse's report on the machine where I used to build NetBSD 
> releases, because that newer version of NetBSD never sees any disks 
> on that machine.

 So, say, a SCSI driver running at a raised IPL for a prolonged time could 
delay or block the timer interrupt and consequently upset NTP timekeeping, 
especially on machines that have no high-resolution timer with a suitable 
span (like the IOASIC free-running counter implemented in the 5000/240) 
that could compensate for missed interrupts.  I think serial ports running 
at a low baud rate are also suspect for starving interrupts in some 
scenarios (polling for shift register status).
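
 As a rough illustration of that failure mode, here is a minimal model 
(not NetBSD code; the function name, the 10 ms tick period and the blocked 
intervals are all hypothetical).  It assumes that while interrupts are 
blocked at most one timer interrupt can stay pending, so any further ticks 
that come due within the same blocked interval are simply lost:

```python
def lost_ticks(blocked_intervals_us, period_us=10_000):
    """Count timer ticks lost while interrupts are blocked.

    blocked_intervals_us: list of (start, end) times in microseconds
    during which the timer interrupt cannot be delivered.
    period_us: tick period; 10 ms (100 Hz) is an assumed value.

    Model assumption: ticks come due at multiples of period_us, one
    pending tick is latched and delivered on unblock, the rest are lost.
    """
    lost = 0
    for start, end in blocked_intervals_us:
        # Number of ticks that come due in the half-open span (start, end].
        due = end // period_us - start // period_us
        # The first one is only delayed; every further one is dropped.
        lost += max(0, due - 1)
    return lost


def clock_lag_us(blocked_intervals_us, period_us=10_000):
    """Time by which a purely tick-counting clock falls behind."""
    return lost_ticks(blocked_intervals_us, period_us) * period_us
```

 In this model a single 35 ms stretch at raised IPL, `[(0, 35_000)]`, 
drops two ticks and leaves a tick-counting clock 20 ms behind, while a 
stretch shorter than one tick period never loses anything.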

 And those things have surely evolved over the years, so even if things 
worked perfectly say 20 years ago, something could have regressed since 
and not have been noticed if people who made such a change only use fast 
computers.

 And the 5000/150 has a high-resolution timer in the CPU's CP0 (not in the 
IOASIC though), which has a 32-bit span and is clocked at half the CPU 
clock rate, so it only overflows after ~172 seconds, which is more than 
enough to compensate for a missed tick or two (of course the handling for 
such a scenario has to be explicitly implemented by the OS).
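
 The arithmetic and the kind of handling meant here can be sketched as 
follows.  This is only an illustration under stated assumptions, not the 
NetBSD implementation: the 50 MHz clock is an assumed value chosen because 
it reproduces the ~172 second figure, and `TickAccumulator` and its 
parameters are hypothetical names.  The idea is to read the free-running 
counter on every timer interrupt and credit as many ticks as the counter 
says have actually elapsed, using modular subtraction so that a counter 
wrap between two reads is harmless:

```python
COUNTER_BITS = 32
COUNTER_MASK = (1 << COUNTER_BITS) - 1


def wrap_seconds(cpu_hz):
    """Seconds until a 32-bit counter clocked at cpu_hz / 2 overflows."""
    return (1 << COUNTER_BITS) / (cpu_hz / 2)


def elapsed_counts(prev, now):
    """Counts elapsed between two reads; correct across a single wrap."""
    return (now - prev) & COUNTER_MASK


class TickAccumulator:
    """Credit clock ticks from a free-running counter (hypothetical sketch)."""

    def __init__(self, counts_per_tick, start_count):
        self.counts_per_tick = counts_per_tick
        self.last = start_count & COUNTER_MASK
        self.ticks = 0

    def on_interrupt(self, count_now):
        """Call from the timer interrupt with the current counter value."""
        delta = elapsed_counts(self.last, count_now)
        n = delta // self.counts_per_tick
        self.ticks += n
        # Advance by whole ticks only, so the fractional remainder is
        # carried into the next interrupt instead of being discarded.
        self.last = (self.last + n * self.counts_per_tick) & COUNTER_MASK
        return n
```

 With an assumed 50 MHz clock, `wrap_seconds(50_000_000)` comes to about 
171.8 seconds.  If two interrupts are missed, the next call to 
`on_interrupt()` sees three tick periods' worth of counts and credits 
three ticks at once, provided the gap between calls stays below the wrap 
time.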

  Maciej
