Subject: Re: Proposed changes to cc_microset().
To: Frederick Bruckman <fredb@immanent.net>
From: Simon Burge <simonb@wasabisystems.com>
List: tech-kern
Date: 07/06/2004 09:38:31
Frederick Bruckman wrote:

> Please let me draw your attention to the following:
> 
>   ftp://ftp.netbsd.org/pub/NetBSD/misc/fredb/cc_microset.diff
> 
> The main idea, is that, though there will be unpredictble and
> unavoidable delays in executing cc_microtime() and cc_microset(), the
> latency in cc_microset() needn't be a factor, if we only average out
> over a longer interval. Additionally, even though the comments in
> cc_micotime() promise to interpolate between ticks, it's really
> interpolating between *seconds*, which, I believe, loses accuracy.
> 
> So, the point of the patch, is, to call cc_microset() more frequently,
> to update the base time for cc_microtime(), while actually doing the
> frequency calibration less frequently (1/10 second and 10 seconds,
> respectively).
> 
> I chose 1/10 second for the shorter interval, because all the port's
> "Hz" are divisible by 10.

I haven't looked at the math close enough yet, but does will a port with
hz != 10 still work ok?  There's ports that use hz = 1024, 512, 256, 64
and 25, and some port let you define your own values.  In particular,
all the ports that use the "mcclock" device use 256 Hz as a default.

It looks like if you want to keep the 10 second long period you just
need to update the calculation of ci_cc_microset_count, and adjust the
"one tenth of a second" and "hundred times" comments.

> The 10 seconds is arbitrary, but it seems adequate on at least one
> i386 box to let the calculated frequency vary smoothly. This is
> important, because we're using the results from one period to make the
> cc_microtime() calculation in the next, and it makes no sense to do
> that if the numbers vary randomly from one period to the next.
> 
> The burden of calling cc_microset() more frequently is offset, to some
> extent, by the fact that it does less on all but 1 in 100 calls. Is
> that acceptable? Note that the slowest ports don't even have a cycle
> counter, and so will never call cc_microset().

I'm guessing a profiled kernel wouldn't show any useful info since we're
analysing something at clock interrupt time.  It should be easy to rig
up some rudimentry timing info to see how many clock cycles that the
1-in-100 and 99-in-100 calls actually take.

For the SMP case, your new code results in ten times the number of microset
IPIs.  I really have no idea what sort of performance impact that has.

> A bonus is that we can monitor (and reject) large burps more easily.
> The current code only rejects misses by 0.5 seconds or larger, but the
> "ntpd" daemon typically steps time by as little as 0.128 seconds. A
> put the tighter constraint in after seeing this in the log:
> 
>  cc_info: CPU 0: computed frequency = 9235698347/-1054033355 = -8.-762244 Mz
> 
> Although it was at boot-up and was corrected 2 seconds later, it can't
> be be good.
> 
> The patch is for i386 only, but it's easy to see what needs to be
> changed in ".../cpu.h" on the other ports.
> 
> Does anyone see a problem with the patch, or with the basic idea?

Should ci_cc_microset_cc_lo/hi be uint32_t instead of int64_t?

I'm not understanding the relationship between ci_cc_microset_cc_lo/hi
and ci_cc_cc either.  We only set ci_cc_microset_cc_lo once every ten
seconds, and possibly increment ci_cc_microset_cc_hi every now and then
(six or seven times over 10 seconds on a 3GHz box?), and use them in the
following calculation:

                denom = (ci->ci_cc_microset_cc_hi * 0x100000000LL) +
                    (cc - ci->ci_cc_microset_cc_lo);

I'm missing something there...

I also wonder if the two expressions in:

        if (delta < (100000 - (1000000 / hz) / 2) || 
             delta > (100000 + (1000000 / hz) / 2)) {

might be better off being pre-calculated or not?  Divides can take a
long time on some architectures.  On the other hand, divides might
be quicker than a memory load on others (though we need to load "hz"
anyway)...

I just occurs to me that as we pass 4GHz boxes the current "once per
second" thing isn't going to be good enough, is it?  Certainly from just
this alone, we need to make some changes soon and from what I understand
your changes should handle this.  That said when we hit 40GHz boxes
(only another factor of ten, and it wasn't that long ago that 400MHz
boxes were fast) when we will need to set ci_cc_microset_count so that
we're called even more than every 1/10th of a second...

> Another thing -- after building with CC_VERBOSE, "gdb" can't actually
> set cc_verbose to "0" to turn it off. ("gdb" dumps core.) Can anyone
> tell me what I'm doing wrong?

No idea off hand, but you shouldn't need a "volatile" in the declaration
of cc_verbose.  Does it help removing that?

I'd also change the name of that cpp flag to CC_DEBUG or DEBUG_CC
(something with "DEBUG" in it) to keep in line with other debug flag
names.

Hopefully all the above is not too muddled!  Now I'll go and grab a
coffee and maybe look at it all again.  :-)

Simon.
--
Simon Burge                            <simonb@wasabisystems.com>
NetBSD Support and Service:         http://www.wasabisystems.com/