tech-kern: cc_microtime() bug makes gettimeofday() go backwards?

Subject: cc_microtime() bug makes gettimeofday() go backwards?
To: None <tech-kern@netbsd.org>
From: Nathan J. Williams <nathanw@wasabisystems.com>
List: tech-kern
Date: 05/06/2004 19:03:29
In some timekeeping investigation on my workstation, I found that I
suffer from the "gettimeofday() goes backwards" problem, as described
in PR kern/14058, with a simple test program there. After some poking
around in kern_microtime.c, the following code at the end of
cc_microtime() caught my eye:

        /*
         * Ordinarily, the current clock time is guaranteed to be later
         * by at least one microsecond than the last time the clock was
         * read.  However, this rule applies only if the current time is
         * within one second of the last time.  Otherwise, the clock will
         * (shudder) be set backward.  The clock adjustment daemon or
         * human equivalent is presumed to be correctly implemented and
         * to set the clock backward only upon unavoidable crisis.
         */

        simple_lock(&microtime_slock);
        sec = lasttime.tv_sec - t.tv_sec;
        usec = lasttime.tv_usec - t.tv_usec;
        if (usec < 0) {
                usec += 1000000;
                sec--;
        }
        if (sec == 0) {
                t.tv_usec += usec + 1;
                if (t.tv_usec >= 1000000) {
                        t.tv_usec -= 1000000;
                        t.tv_sec++;
                }
        }
        lasttime = t;
        simple_unlock(&microtime_slock);

This appears to be trying to prevent t.tv_usec from going backwards,
short-term, by correcting it relative to the last time read
out. However, on modern fast machines, two calls to microtime can
occur within one microsecond. When this happens, the "sec == 0" test
fires, and t.tv_usec is improperly advanced beyond the current
time. On subsequent calls, t then appear to be in the past compared to
lasttime, and so t.tv_usec is advanced further and further by the + 1
term. This means that no more than 999999 calls to gettimeofday() will
return the same tv_sec value, even when the call rate is faster than
that (my Athlon XP 2000 can sustain about 1,600,000 gettimeofday()
calls per second).

The fix I have is to change:

        if (sec == 0) {

to

        if (sec == 0 && usec > 0) {

which avoids doing any "corrections" in fast, repeated calls. This
seems to have fixed the problem on my alpha box (UP1500, 800MHz
21264), which previously would have a negative gettimeofday() result
from the test program about every 20 seconds, and on my dell PC
(2.4GHz Xeon). The Athlon box now trips the error about every 7
seconds instead of every 2 seconds... which is an improvement of
sorts, but something else is still wrong.

Comments? Timekeeping hasn't really been my specialty.

        - Nathan