The issues you face would be:
- a (consistent) fast way to do the get_timecount() procedure. This could be a (fast & special) system call calling the actual thing, or a user level access (like TSC, but on MP you might want to apply CPU-local offsets for skew compensation, requiring additional data from the kernel)
- the scheme should work with all timecounters (maybe by falling back to the real thing)

As for the speed, it very much depends on the underlying timecounter. TSC is fast, and on an AMD 1100T @ 3.3GHz ntpd measured a round trip time for gettimeofday() of ~130ns. When you use other timecounters (hpet/ACPI PM), read times on the same system can easily go to 5 usec because of the bus interaction.

I guess we would get the most speed-up out of being able to do the get_timecount() at user level and calculating the time scales with the help of the kernel-updated timehands data. If we cannot get the timecounter value at user level, I am not sure that we would make big progress for the case where we need to enter the kernel via normal system call means. Has anyone measured the relations and conditions between syscall/gettimecount/timescale-calculations?

Note: Even if the read time is several usec you can still create a good time scale. A driver I wrote for a PCIe radio clock board needs ~3500 +/- 100ns to read the 100ns precise GPS-derived time stamp from the device. The calibration issue here is when, within the 3.5 usec section, the timestamp is taken in the radio clock board. A modified PPS interface allowed (when temperature at oscillator level was stable) synchronization in the +/-100ns range. This makes a reasonably good time transfer for time of day applications.

Frank

On 01/18/12 21:20, Matthew Mondor wrote:
> On Wed, 18 Jan 2012 10:54:43 +0100
> Joerg Sonnenberger <joerg%britannica.bec.de@localhost> wrote:
>
>> On Wed, Jan 18, 2012 at 01:02:22AM -0500, Matthew Mondor wrote:
>>> Since that doesn't already exist but appears at first glance rather
>>> simple to implement, is there a particular reason that makes this
>>> undesirable?
>>
>> On platforms with SMP support, it often requires per-CPU mappings.
>> This makes context switching more expensive and/or costs memory.
>> That doesn't apply for M68K though.
>
> I noticed that the KERN_HARDCLOCK_TICKS sysctl(3) is still quite faster than gettimeofday(2) (although it only exports a single int, the hardclock(9)):
>
>     # 1 million calls on a P4
>     behemoth$ time ./sysctl
>     0.89 real 0.20 user 0.66 sys
>     behemoth$ time ./gettimeofday
>     1.54 real 0.16 user 1.30 sys
>
> Would it make more sense to export a higher resolution timestamp via sysctl and have libc clock calls use that? Any idea if sysctl also suffers from the SMP related overhead already (and thus gives similar performance to what a shared page would achieve)?
>
> A part that's currently unclear to me is the impact of actively updating a timeval regularly from the kernel, as currently gettimeofday(2) results in on-demand calls to microtime(), which calls bintime(), which in turn calls binuptime(), which then gets the information from timehands. I guess that timehands is what would have to get exported...
>
> Thanks,