Subject: Re: Strange numbers from gettimeofday(2)
To: None <port-xen@NetBSD.org>
From: Jed Davis <jdev@panix.com>
List: port-xen
Date: 01/13/2006 23:21:58
Jed Davis <jdev@panix.com> writes:

> Sometimes, gettimeofday(2) will return large negative numbers for the
> tv_usec field; this has been happening spontaneously, but I've been
> able to reproduce it by either pausing or ddb-breaking a domU for a
> few seconds (the exact interval varies on different hardware).

It's worse than that, actually.  The problem AFAICT is in the MI
cc_microtime (in kern_microtime.c):

	t.tv_usec += (cc * ci->ci_cc_ms_delta) / ci->ci_cc_denom;
	while (t.tv_usec >= 1000000) {
		t.tv_usec -= 1000000;
		t.tv_sec++;
	}

It looks like the RHS of the += winds up as complete garbage, and
sometimes the multiply overflows and winds up making tv_usec negative.
When it's negative, that gets passed back to userspace, and e.g. BIND
will notice this and complain; when it's positive, it gets folded into
the tv_sec, and then cron sees the clock suddenly bounce back and
forth by up to ~35 minutes (1<<31 us) and does undesirable things
(like running three copies of a script at once that step on each
others' temp files).
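
To make the two failure modes concrete, here is a rough userland model of
that update.  It is not the kernel code: struct tv32, wrap32() and
model_cc_microtime() are made-up names, and the assumption that the
result gets cut down to a signed 32-bit tv_usec is mine.

```c
#include <stdint.h>

/* Hypothetical stand-in for a struct timeval with 32-bit fields. */
struct tv32 {
	int32_t tv_sec;
	int32_t tv_usec;
};

/* Reduce a 64-bit value modulo 2^32 and reinterpret it as signed. */
static int32_t
wrap32(int64_t v)
{
	uint32_t u = (uint32_t)v;	/* well-defined modulo 2^32 */

	if (u & 0x80000000u)
		return (int32_t)(u - 0x80000000u) - INT32_MAX - 1;
	return (int32_t)u;
}

/*
 * Model of the quoted update: scale the cycles seen since the last
 * calibration, add them to tv_usec, then fold whole seconds into tv_sec.
 * The arithmetic is done in 64 bits and truncated on purpose (assumption).
 */
static struct tv32
model_cc_microtime(struct tv32 base, int64_t cc, int64_t delta, int64_t denom)
{
	struct tv32 t = base;

	t.tv_usec = wrap32((int64_t)base.tv_usec + (cc * delta) / denom);
	/* A negative tv_usec never enters this loop and is returned as-is. */
	while (t.tv_usec >= 1000000) {
		t.tv_usec -= 1000000;
		t.tv_sec++;
	}
	return t;
}
```

With a sane denom the result advances by a fraction of a second.  With a
denom that is far too small, an offset just under 1<<31 us is folded into
tv_sec as a jump of up to about 2147 seconds, and anything past that
wraps and comes back as a negative tv_usec.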

One problem with cc_microtime is that it assumes that each CPU NetBSD
knows about is an actual physical CPU, and thus a cycle counter
timestamp taken some number of ticks ago is still valid.

That might not be a problem with HT; I don't know how that works.

Oh, of course: It will *definitely* break if the domain is paused or
in ddb for more than a second, because xen_timer_handler will call
cc_microset multiple times, passing it time(9) each time, which is
dutifully being advanced by the stacked-up calls to hardclock(9).  So
the second time cc_microset sees that more or less a second has
passed, but very few CPU cycles have; it (if I read it correctly) then
estimates a very low CPU speed, and for the next second of real time
cc_microtime gets absurdly large values which overflow and cause
negativeness.
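
A simplified model of that sequence, assuming the calibration just
records how much clock time and how many cycles passed between its last
two calls.  struct cc_state, model_microset() and model_usec_offset() are
invented for illustration and may not match what cc_microset really does.

```c
#include <stdint.h>

/* Hypothetical per-CPU calibration state; field names are illustrative. */
struct cc_state {
	uint64_t last_cc;	/* cycle counter at the last calibration */
	int64_t  last_us;	/* clock at the last calibration, in us */
	int64_t  us_delta;	/* clock time between the last two calibrations */
	int64_t  denom;		/* cycles counted between the last two calibrations */
};

/* Calibrate: remember the clock time and cycles elapsed since the last call. */
static void
model_microset(struct cc_state *s, int64_t now_us, uint64_t cc_now)
{
	s->us_delta = now_us - s->last_us;
	s->denom = (int64_t)(cc_now - s->last_cc);
	s->last_us = now_us;
	s->last_cc = cc_now;
}

/* Estimate the microseconds elapsed since the last calibration. */
static int64_t
model_usec_offset(const struct cc_state *s, uint64_t cc_now)
{
	int64_t cc = (int64_t)(cc_now - s->last_cc);

	if (s->denom == 0)
		return 0;	/* not calibrated yet */
	return (cc * s->us_delta) / s->denom;
}
```

If two stacked-up calls arrive 30000 cycles apart while the clock has
been advanced by a full second between them, denom is 30000, and half a
second's worth of cycles on a 3 GHz counter (1.5e9) scales to 5e10 us,
which is far more than a signed 32-bit tv_usec can hold.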

Now, &cc_microtime gets set as microtime_func because a TSC is
detected in arch/xen/i386/identcpu.c, though I don't know that Xen
will work on anything without a TSC; my first thought was that it
might be desirable to disable that and always use xen_microtime.

However: xen_microtime would need to call get_tsc_offset_ns() to be of
any use, and really shouldn't need to call get_time_values_from_xen()
as long as the timer event handler does, since (according to the docs)
a timer event will be asserted whenever a domain becomes scheduled.
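
The shape of what I have in mind, as a sketch only: take the snapshot the
timer event handler last stored and add the time implied by the TSC
ticks since then.  struct shadow_time, struct tv64, model_xen_microtime()
and the tsc_hz parameter are all hypothetical; this does not reproduce
get_tsc_offset_ns() or the real shared_info layout.

```c
#include <stdint.h>

/* Hypothetical snapshot stored by the timer event handler. */
struct shadow_time {
	int64_t  sec;
	int64_t  usec;
	uint64_t tsc_stamp;	/* TSC value when the snapshot was taken */
};

struct tv64 {
	int64_t tv_sec;
	int64_t tv_usec;
};

/* Snapshot time plus the time implied by the TSC ticks since the snapshot. */
static struct tv64
model_xen_microtime(const struct shadow_time *sh, uint64_t tsc_now,
    uint64_t tsc_hz)
{
	uint64_t diff = tsc_now - sh->tsc_stamp;
	/*
	 * Split off whole seconds first so the remainder times 1e9 stays
	 * within 64 bits (assumes tsc_hz is well below about 1.8e10).
	 */
	uint64_t whole = diff / tsc_hz;
	uint64_t rem_ns = (diff % tsc_hz) * UINT64_C(1000000000) / tsc_hz;
	struct tv64 t;

	t.tv_sec = sh->sec + (int64_t)whole;
	t.tv_usec = sh->usec + (int64_t)(rem_ns / 1000);
	while (t.tv_usec >= 1000000) {
		t.tv_usec -= 1000000;
		t.tv_sec++;
	}
	return t;
}
```

Because the offset comes from a fixed frequency rather than from a
per-second recalibration, a long gap between snapshots only makes the
offset large; it cannot shrink the divisor.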

Problem 1: It's still completely ignoring time(9), which is very wrong
AIUI; and resettodr() is a no-op, which makes it worse.

Problem 2: I tried this, and the shadow_tv was 42 seconds fast and
drifting slowly but noticeably forward; the dom0 was running ntpdate
from cron and was on time.  (Hm... the DOM0_SETTIME is called only
when resettodr() is, so if the drift is small enough for ntpdate to
always use adjtime(2), it won't ever correct Xen's time?)

Problem 3: yamt mentioned on ICB that the tsc_timestamp in the
shared_info page might not be right for the current CPU; I haven't
checked on what Xen actually does here yet, but it seems to me that
that would be a bug, as the timestamp is useless if it's from an
undefined physical CPU that we can't access.

-- 
(let ((C call-with-current-continuation)) (apply (lambda (x y) (x y)) (map
((lambda (r) ((C C) (lambda (s) (r (lambda l (apply (s s) l))))))  (lambda
(f) (lambda (l) (if (null? l) C (lambda (k) (display (car l)) ((f (cdr l))
(C k)))))))    '((#\J #\d #\D #\v #\s) (#\e #\space #\a #\i #\newline)))))