Subject: Re: Strange numbers from gettimeofday(2)
To: Jed Davis <jdev@panix.com>
From: Manuel Bouyer <bouyer@antioche.eu.org>
List: port-xen
Date: 01/13/2006 20:41:22
On Fri, Jan 13, 2006 at 02:40:03AM -0500, Jed Davis wrote:
> Sometimes, gettimeofday(2) will return large negative numbers for the
> tv_usec field; this has been happening spontaneously, but I've been
> able to reproduce it by either pausing or ddb-breaking a domU for a
> few seconds (the exact interval varies on different hardware).
> 
> As one might expect, things like cron(8) aren't at their best when the
> clock suddenly moves back and forth by two billion microseconds ~=
> half an hour, and I've seen this confusion of time in syslog
> timestamps.
> 
> The system where it's been causing problems by occurring without
> suspending the system is hyperthreaded, and others where it's not
> aren't; this may or may not be coincidence.
> 
> Fixing the integer overflow in get_tsc_offset_ns() neither fixes this
> nor seems to make it worse.
> 
> I'm attempting to look into this, and welcome suggestions.

I've noticed this too, but haven't found the cause. I've seen it on
a dual-CPU PIII system (mrtg isn't happy with time going backward
either). I haven't seen it on a single-CPU PIII that also runs mrtg.

Hmm, did you notice the issue in dom0, or only in a domU not running on
the same CPU?

I saw that the time interface has changed in Xen3: it now uses a version
field which is incremented both before and after the time update.
However, I can't see how the current code could lead to a race condition.
I even checked the assembly dump of get_time_values_from_xen(), and it
looks right.

-- 
Manuel Bouyer <bouyer@antioche.eu.org>
     NetBSD: 26 years of experience will always make the difference