Current-Users archive


Re: Global page into user processes



What you need access to is the timehands and timecounter related data structures (they would need to be modified a bit to keep all the data within the mapped range).
The issues you face would be:
     - a (consistent) fast way to do the get_timecount() procedure
This could be a (fast and special) system call that calls the actual thing, or a user-level access (like TSC, but on MP you might want to apply CPU-local offsets for skew compensation, which requires additional data from the kernel). The scheme should work with all timecounters (maybe by falling back to the real thing).

As for the speed, it very much depends on the underlying timecounter. TSC is fast, and on an AMD 1100T @ 3.3 GHz ntpd measured a round trip time for gettimeofday() of ~130 ns. When you use other timecounters (hpet/ACPI PM), read times on the same system can easily go to 5 usec because of the bus interaction.
I guess we would get the most speed-up out of being able to do the get_timecount() at user level and calculating the time scales with the help of the kernel-updated timehands data. If we cannot get the timecounter value at user level, I am not sure that we make big progress, since we would still need to enter the kernel via normal system call means. Has anyone measured the relative costs of the syscall, the get_timecount() read, and the time scale calculations?
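
To make the user-level path concrete, here is a minimal sketch of what a reader of kernel-updated timehands data could look like. Everything named here is an assumption for illustration: struct user_timehands, user_gettime(), frac_to_ns() and demo_counter() are hypothetical, and the layout is only loosely modeled on the kernel's timehands (a generation counter, a scale factor, and the time and counter value at the last update). A return value of -1 stands for "fall back to the real thing", i.e. the normal system call.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot the kernel would keep up to date in a shared page. */
struct user_timehands {
    _Atomic uint32_t gen;   /* generation; 0 means "update in progress" */
    uint64_t sec;           /* time at last update: whole seconds */
    uint64_t frac;          /* time at last update: fraction in 2^-64 s */
    uint64_t scale;         /* 2^64 / counter frequency */
    uint32_t base;          /* counter value at last update */
    uint32_t mask;          /* valid counter bits */
};

struct user_time {
    uint64_t sec;
    uint64_t frac;          /* fraction of a second in units of 2^-64 s */
};

/* Stand-in for a user-readable counter such as the TSC (assumption). */
static uint32_t demo_counter_value;
static uint32_t demo_counter(void) { return demo_counter_value; }

/* Convert a 2^-64 s fraction to nanoseconds (truncating). */
static uint32_t
frac_to_ns(uint64_t frac)
{
    return (uint32_t)(((frac >> 32) * 1000000000ULL) >> 32);
}

/*
 * Returns 0 and fills *out on success, or -1 if no consistent snapshot
 * was seen after a few tries; the caller then uses the system call.
 */
static int
user_gettime(struct user_timehands *th, uint32_t (*read_counter)(void),
    struct user_time *out)
{
    assert(th != NULL && read_counter != NULL && out != NULL);

    for (int tries = 0; tries < 8; tries++) {
        uint32_t gen = atomic_load_explicit(&th->gen, memory_order_acquire);
        if (gen == 0)
            continue;       /* kernel is in the middle of an update */

        /* Ticks since the last update. The product below only fits in
         * 64 bits while less than one second of ticks has elapsed. */
        uint32_t delta = (read_counter() - th->base) & th->mask;
        uint64_t add = (uint64_t)delta * th->scale;
        uint64_t frac = th->frac + add;
        uint64_t sec = th->sec + (frac < add);   /* carry into seconds */

        /* Accept the result only if the generation did not change. */
        atomic_thread_fence(memory_order_acquire);
        if (atomic_load_explicit(&th->gen, memory_order_relaxed) == gen) {
            out->sec = sec;
            out->frac = frac;
            return 0;
        }
    }
    return -1;
}
```

On MP with TSC, read_counter() is where a CPU-local offset would have to be applied, which this sketch does not attempt.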

Note:
Even if the read time is several usec, you can still create a good time scale. A driver I wrote for a PCIe radio clock board needs ~3500 +/- 100 ns to read the 100 ns precise GPS-derived time stamp from the device. The calibration issue here is at what point within the 3.5 usec window the radio clock board takes the timestamp. A modified PPS interface allowed (when the temperature at oscillator level was stable) synchronization in the +/- 100 ns range. This makes a reasonably good time transfer for time of day applications.
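
The calibration problem in the note can be written down as a small piece of arithmetic. This is only a sketch and not the driver's actual code: the function name and the latch_ppm parameter (the assumed point within the read at which the board latches its timestamp, in parts per million of the read duration) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * t0_ns, t1_ns: local clock just before and just after the device read.
 * dev_ns:       timestamp returned by the device.
 * latch_ppm:    assumed latch point within [t0, t1], 0..1000000
 *               (hypothetical calibration constant; 500000 = midpoint).
 * Returns device time minus local time at the assumed latch instant.
 */
static int64_t
clock_offset_ns(int64_t t0_ns, int64_t t1_ns, int64_t dev_ns,
    int64_t latch_ppm)
{
    assert(t1_ns >= t0_ns && latch_ppm >= 0 && latch_ppm <= 1000000);

    int64_t latch_local = t0_ns + (t1_ns - t0_ns) * latch_ppm / 1000000;
    return dev_ns - latch_local;
}
```

With a 3500 ns read and an unknown latch point, the offset is uncertain by up to the full read duration; pinning down latch_ppm is what removes most of that uncertainty.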

Frank

On 01/18/12 21:20, Matthew Mondor wrote:
On Wed, 18 Jan 2012 10:54:43 +0100
Joerg Sonnenberger<joerg%britannica.bec.de@localhost>  wrote:

On Wed, Jan 18, 2012 at 01:02:22AM -0500, Matthew Mondor wrote:
Since that doesn't already exist but appears at first glance rather
simple to implement, is there a particular reason that makes this
undesirable?
On platforms with SMP support, it often requires per-CPU mappings. This
makes context switching more expensive and/or costs memory. That doesn't
apply for M68K though.
I noticed that the KERN_HARDCLOCK_TICKS sysctl(3) is still quite a bit
faster than gettimeofday(2) (although it only exports a single int, the
hardclock(9) tick count):

# 1 million calls on a P4
behemoth$ time ./sysctl
         0.89 real         0.20 user         0.66 sys
behemoth$ time ./gettimeofday
         1.54 real         0.16 user         1.30 sys
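
At 1 million calls, these figures work out to roughly 0.89 usec per sysctl call and 1.54 usec per gettimeofday call. The test programs themselves are not shown in the thread; a loop of roughly the following shape would give a comparable per-call number for gettimeofday(2). The function name is hypothetical and the timing method (CLOCK_MONOTONIC around the loop rather than time(1)) is an assumption.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>
#include <time.h>

/* Average wall-clock cost in nanoseconds of one gettimeofday(2) call. */
static double
gettimeofday_ns_per_call(long n)
{
    struct timespec start, end;
    struct timeval tv;

    assert(n > 0);

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (long i = 0; i < n; i++)
        gettimeofday(&tv, NULL);
    clock_gettime(CLOCK_MONOTONIC, &end);

    double elapsed_ns = (double)(end.tv_sec - start.tv_sec) * 1e9 +
        (double)(end.tv_nsec - start.tv_nsec);
    return elapsed_ns / (double)n;
}
```

Calling gettimeofday_ns_per_call(1000000) repeats the 1 million call experiment; the result depends heavily on the timecounter in use.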

Would it make more sense to export a higher resolution timestamp via
sysctl and have libc clock calls use that?  Any idea if sysctl also
suffers from the SMP related overhead already (and thus gives similar
performance to what a shared page would achieve)?

A part that's currently unclear to me is the impact of actively
updating a timeval regularly from the kernel, as currently
gettimeofday(2) results in on-demand calls to microtime(), which calls
bintime(), which in turn calls binuptime(), which then gets the
information from timehands.  I guess that timehands is what would have
to get exported...

Thanks,


