Subject: Re: serial port silo overflow repair
To: Simon Burge <simonb@telstra.com.au>
From: Erik E. Fair <fair@clock.org>
List: tech-kern
Date: 07/29/1997 03:50:07
If all your systems are running the same OS, and have a comparable load, then
I'd say that the crystal quality in each machine is what you're measuring.

System clock drift as measured by NTP is due to a combination of drift in the
hardware clock, and OS interrupt latency (and, to some degree, OS scheduling
variations). Ideally, the OS adds no latency (or a consistent latency with no
variation due to load), and thus you are left with whatever accuracy the
hardware clock can give you. However, when you run two different OS's on a
given piece of hardware, and get different drift numbers, one of those OS's
is suspect.
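
To put rough numbers on that reasoning, here is a minimal sketch in Python.
It assumes drift is expressed in parts per million (the unit ntpd keeps in
its drift file). The function names and the sample figures are made up for
illustration; they are not measurements from any of the machines discussed.

```python
SECONDS_PER_DAY = 86_400


def ppm_to_seconds_per_day(ppm: float) -> float:
    """Convert a frequency error in parts per million to seconds of
    clock error accumulated per day."""
    return ppm * 1e-6 * SECONDS_PER_DAY


def software_drift_ppm(drift_os_a_ppm: float, drift_os_b_ppm: float) -> float:
    """Given the drift seen under two OSes on the SAME hardware, return the
    part of the difference that cannot be blamed on the crystal.

    The crystal is common to both runs, so any difference has to come from
    the software side (interrupt latency, lost clock ticks, scheduling).
    """
    return abs(drift_os_a_ppm - drift_os_b_ppm)


if __name__ == "__main__":
    # Hypothetical figures: same box, two operating systems.
    os_a_ppm = 12.0
    os_b_ppm = 140.0

    suspect_ppm = software_drift_ppm(os_a_ppm, os_b_ppm)
    print(f"OS A drifts {ppm_to_seconds_per_day(os_a_ppm):.2f} s/day")
    print(f"OS B drifts {ppm_to_seconds_per_day(os_b_ppm):.2f} s/day")
    print(f"{suspect_ppm:.1f} ppm of the difference is attributable to software")
```

This does not say which of the two OSes is at fault, only how much of the
observed drift the hardware clock cannot account for.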

In addition, it helps to have the machines in an environment where the
temperature doesn't vary much. Despite the fact that I don't have that for my
machines, I'd say it's not the deciding factor - not when the drift rates of
machines in the same room differ by an order of magnitude.

NTP can embarrass the hardware designer for choosing a cheap crystal clock,
the systems programmer for being sloppy with "spl", and the operations manager
for letting his A/C system allow wide swings in machine room temperature. It's
pretty cool software.

All this stuff aside, I still think working on improving NetBSD's interrupt
latency in both the MD and MI code is a good idea, for this, for PPP over
serial lines, and everything else. The point of this thread was to find tools
and techniques for identifying specific parts of the kernel that need
improvement in this regard.

	Erik E. Fair    fair@clock.org