Subject: Timekeeping in NetBSD -or- getting ready to split the microsecond
To: None <tech-kern@netbsd.org>
From: Frank Kardel <kardel@acm.org>
List: tech-kern
Date: 10/25/2005 02:16:37

Why?

Until the day I got my hands on a SOEKRIS 4801 I was pretty content
with NetBSD's time implementation. It was sufficient for all the
tests I need to run when doing NTP development work.

Installing NetBSD on the SOEKRIS was not a big problem - but the first
ping, or rather the claimed response times of 55 ms, irritated me.
I found out that the hardware was supposed to be OK. I also found out that
others had had trouble too, but patches were available for Linux and FreeBSD.
Well, I didn't want to give up on NetBSD, especially as it now had PF and
VERIEXEC - perfect for drop-behind-the-desk micro network servers....

That's how it started...

Current state of affairs:
Looking into the time keeping code I found many indications that certain
hardware platforms tend to give headaches when it comes to solid
time-keeping. There are broken latches, power saving features and other
horrors. I knew others didn't have that many problems (any more?) - so I
looked around and found the very good work of a long time (Free)BSD
committer, Poul-Henning Kamp. He is also active within the NTP project.
He did an architecturally very sound implementation of clean, scalable
and very precise time keeping code. There is also an article about his
implementation at
http://phk.freebsd.dk/pubs/timecounter.pdf
It is worth reading.

Well, then nothing happened for a while... (being busy with work and NTP
development)

Last Saturday: *TIMEOUT* - SOEKRIS was STILL sitting on the desk and
bothering me:

I ported the time counters into my -current tree. This turned out to be
quicker than I thought and early in the morning:

    NetBSD could now keep time on the SOEKRIS - this is what I wanted...

Where are we now?
There is:
    - a dual mode (old NetBSD code and Timecounters) implementation
    - a timecounter enabled i386 architecture (not XEN though)
    - an update of the NTP kernel API to support nanosecond
      synchronisation and TAI information
    - 27 MHz counter support for Geode
    - an impression of what still needs to be improved
    - com PPS support for time counters
    - z8530tty PPS support for time counters
just to name the highlights.

What are the immediate benefits?
    - MP timekeeping now works!
    - SOEKRIS now works!
    - true nanoseconds
    - new NTP kernel API

Examples:

A short snippet of the loop statistics from the SOEKRIS:
MJD   Second    Offset(sec) Freq.  Jitter      Stability Poll
53667 74409.037 0.000000884 10.649 0.000000526 0.000949 4
53667 74426.037 0.000000790 10.649 0.000000555 0.000888 4
53667 74444.037 0.000000622 10.649 0.000000602 0.000830 4
53667 74459.037 0.000000574 10.649 0.000000431 0.000777 4
53667 74477.037 0.000000484 10.649 0.000001627 0.000727 4
53667 74493.037 0.000000369 10.649 0.000000706 0.000680 4
53667 74511.037 0.000000403 10.649 0.000000724 0.000636 4
53667 74526.037 0.000000392 10.649 0.000000273 0.000595 4
53667 74542.038 0.000000526 10.649 0.000000628 0.000556 4
53667 74559.037 0.000000424 10.649 0.000000505 0.000520 4
53667 74574.037 0.000000588 10.649 0.000000733 0.000487 4
53667 74592.037 0.000000303 10.648 0.000000317 0.000485 4
53667 74608.038 0.000000174 10.648 0.000000529 0.000454 4
53667 74624.037 0.000000317 10.648 0.000000525 0.000424 4
53667 74639.039 0.000000091 10.648 0.000000595 0.000397 4
53667 74654.038 0.000000478 10.648 0.000000581 0.000371 4
53667 74670.040 0.000000484 10.648 0.000000766 0.000347 4
53667 74687.039 0.000000329 10.648 0.000000442 0.000325 4
53667 74702.038 0.000000228 10.648 0.000000426 0.000304 4
53667 74718.038 0.000000035 10.648 0.000001067 0.000284 4
53667 74734.038 0.000000352 10.648 0.000000591 0.000266 4
53667 74749.038 0.000000231 10.648 0.000001458 0.000249 4
53667 74767.039 0.000000245 10.648 0.000000840 0.000233 4
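
To relate these records to the other timestamps in this mail, the first two columns can be turned into Unix time. A minimal sketch, assuming the first column is the Modified Julian Date as written by ntpd's loopstats and the second is seconds past midnight UTC (fraction dropped); mjd_to_unix() is a hypothetical helper, not part of ntpd or the patch:

```c
#include <assert.h>
#include <stdint.h>

/* MJD of 1970-01-01, the Unix epoch. */
#define MJD_UNIX_EPOCH 40587
#define SECS_PER_DAY   86400

/*
 * Convert a Modified Julian Date and whole seconds past midnight (UTC)
 * into seconds since the Unix epoch.
 * Hypothetical helper for illustration only.
 */
static int64_t
mjd_to_unix(int32_t mjd, int32_t sec_of_day)
{
	assert(sec_of_day >= 0 && sec_of_day < SECS_PER_DAY);
	return (int64_t)(mjd - MJD_UNIX_EPOCH) * SECS_PER_DAY + sec_of_day;
}
```

For the first record, mjd_to_unix(53667, 74409) gives 1130186409, i.e. 20:40:09 UTC on Oct 24 2005.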

New ntptime gives:
SOEKRIS: {9} ntptime
ntp_gettime() returns code 0 (OK)
  time c707d947.69c8a000  Mon, Oct 24 2005 22:05:59.413, (.385240610),
  maximum error 978 us, estimated error 1 us, TAI offset 32
ntp_adjtime() returns code 0 (OK)
  modes 0x0 (),
  offset 1.283 us, frequency 10.724 ppm, interval 256 s,
  maximum error 978 us, estimated error 1 us,
  status 0x2107 (PLL,PPSFREQ,PPSTIME,PPSSIGNAL,NANO),
  time constant 4, precision 0.001 us, tolerance 496 ppm,
  pps frequency 10.724 ppm, stability 0.006 ppm, jitter 0.771 us,
  intervals 429, jitter exceeded 2328, stability exceeded 0, errors 1.
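
The "time c707d947.69c8a000" field above is an NTP timestamp in hex: 32 bits of seconds since 1900-01-01 and a 32-bit binary fraction of a second. A minimal sketch of decoding it, assuming NTP era 0; struct unix_time_ms and ntp_to_unix() are names made up for this illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01). */
#define NTP_UNIX_OFFSET 2208988800ULL

/* Hypothetical result type: Unix seconds plus milliseconds. */
struct unix_time_ms {
	int64_t  sec;
	uint32_t msec;
};

/* Decode the two 32-bit halves of an NTP timestamp (era 0 only). */
static struct unix_time_ms
ntp_to_unix(uint32_t ntp_sec, uint32_t ntp_frac)
{
	struct unix_time_ms out;

	assert(ntp_sec >= NTP_UNIX_OFFSET);	/* nothing before 1970 here */
	out.sec = (int64_t)ntp_sec - (int64_t)NTP_UNIX_OFFSET;
	/* fraction is in units of 2^-32 s: msec = frac * 1000 / 2^32 */
	out.msec = (uint32_t)(((uint64_t)ntp_frac * 1000) >> 32);
	return out;
}
```

ntp_to_unix(0xc707d947, 0x69c8a000) yields 1130191559 seconds and 413 ms, which matches the "Oct 24 2005 22:05:59.413" printed by ntptime.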

ntpq -p:
SOEKRIS: {10} ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
+pip.acrys.com   .PPS.            1 u   11   16  377    0.605   -0.032   0.078
*GENERIC(0)      .GPS.            0 l    7   16  377    0.000    0.001   0.004

New sysctl variables:
kern.timecounter.current = Geode
kern.timecounter.counter.i8254 = 1193182 Hz, quality=0
kern.timecounter.counter.Geode = 27000000 Hz, quality=1000
kern.timecounter.counter.TSC = 266652504 Hz, quality=800
kern.timecounter.timestepwarnings = 0
kern.timecounter.nbinuptime = 28334286
kern.timecounter.nnanouptime = 0
kern.timecounter.nmicrouptime = 160802
kern.timecounter.nbintime = 28173490
kern.timecounter.nnanotime = 2624259
kern.timecounter.nmicrotime = 25549235
kern.timecounter.ngetbinuptime = 0
kern.timecounter.ngetnanouptime = 10931302
kern.timecounter.ngetmicrouptime = 10484735
kern.timecounter.ngetbintime = 0
kern.timecounter.ngetnanotime = 0
kern.timecounter.ngetmicrotime = 14463295
kern.timecounter.nsetclock = 2
kern.timecounter.tsc_ok = 0
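
The output above lists three registered counters with "Geode" as the current one. A minimal sketch of one plausible selection rule (highest quality wins, ties go to the higher frequency, negative quality is never picked automatically). The rule and the names struct tc_info, tc_pick_best() and tc_tick_ps() are assumptions for illustration, not code taken from the patch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, cut-down description of a counter; not the kernel struct. */
struct tc_info {
	const char *name;
	uint64_t    frequency;	/* Hz */
	int         quality;	/* higher is better; negative = never auto-select */
};

/* Pick the preferred counter: highest quality, ties broken by frequency. */
static const struct tc_info *
tc_pick_best(const struct tc_info *tcs, size_t n)
{
	const struct tc_info *best = NULL;

	for (size_t i = 0; i < n; i++) {
		if (tcs[i].quality < 0)
			continue;
		if (best == NULL || tcs[i].quality > best->quality ||
		    (tcs[i].quality == best->quality &&
		     tcs[i].frequency > best->frequency))
			best = &tcs[i];
	}
	return best;
}

/* Length of one counter tick in picoseconds (truncated). */
static uint64_t
tc_tick_ps(const struct tc_info *tc)
{
	assert(tc->frequency != 0);
	return 1000000000000ULL / tc->frequency;
}

/* Values copied from the sysctl output above. */
static const struct tc_info soekris_counters[] = {
	{ "i8254",   1193182,    0 },
	{ "Geode",  27000000, 1000 },
	{ "TSC",   266652504,  800 },
};
```

With these values tc_pick_best() returns the Geode entry, and one tick of the 27 MHz counter is about 37 ns.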

Obligatory snippets from dmesg:
NetBSD 3.99.10 (SOEKTC) #6: Sun Oct 23 18:58:14 MEST 2005
        kardel@pip.kardel.name:/fs/IC35L180AVV207-1-n/IC35L120AVV207-0-e/src/NetBSD/netbsd-tc/sys/arch/i386/compile/obj.i386/SOEKTC
total memory = 127 MB
avail memory = 121 MB
Timecounter "i8254" frequency 1193182 Hz quality 0
BIOS32 rev. 0 found at 0xf7840
PCI BIOS rev. 2.0 found at 0xf7861
pcibios: config mechanism [1][x], special cycles [x][x], last bus 0
pcibios_get_intr_routing: function not supported
No PCI IRQ Routing information available.
mainbus0 (root)
cpu0 at mainbus0: (uniprocessor)
cpu0: National Semiconductor Geode GX1 (586-class), 266.66 MHz, id 0x540
cpu0: features 808131<FPU,TSC,MSR,CX8>
cpu0: features 808131<CMOV,MMX>
cpu0: "Geode(TM) Integrated Processor by National Semi"
...
geodecntr0 at pci0 dev 18 function 5
geodecntr0: high resolution counter
Timecounter "Geode" frequency 27000000 Hz quality 1000
...
nsclpcsio0 at isa0 port 0x2e-0x2f: NSC PC87366 rev. 9
nsclpcsio0: GPIO at 0x6600
nsclpcsio0: TMS at 0x6640
gpio1 at nsclpcsio0: 29 pins
npx0 at isa0 port 0xf0-0xff: using exception 16
pcppi0: attached to attimer0
Timecounters tick every 10.000 msec
Timecounter "TSC" frequency 266652504 Hz quality 800
IPsec: Initialized Security Association Processing.
...

So much for the examples - the patches for -current as of
20051024-094940(UTC) can be found, along with some before and after
graphics, at:

    http://www.kardel.name

Future work (roughly sorted):

    - BRANCH: verify that the time counter implementation works and can
              coexist with architectures that have no time counter
              support yet
              fix some utilities such as savecore and vmstat, as time and
              monotime are gone
              make sure manual pages are complete
    - fold initial time counter code into -current for wider testing
    - BRANCH: port other architectures to time counters (amd64 and sparc
              are next on my list as I have that hardware available) -
              see below for what needs to be done
    - BRANCH: change internal timestamp usage to timespec or, better yet,
              to bintime from time counters
              convert to external formats only at external interfaces
    - fold time stamp conversion into -current for bug shakeout
    - import new ntpd into -current
    - BRANCH: clean up time interpretation so that intervals (interval
              timers, timers for keys etc.) strictly use monotonic time
              where monotonic time is needed
    - fold cleaned up time usage into -current

    => time keeping cleanup done - probably after the next leap second ...
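
To illustrate "convert to external formats only at external interfaces": a minimal sketch of converting a binary fixed-point timestamp (seconds plus a 64-bit fraction) to and from a seconds/nanoseconds pair. The struct layouts are assumptions modelled on the FreeBSD bintime idea and the names carry a _sketch suffix to mark them as stand-ins, not the kernel definitions:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: whole seconds plus a fraction in units of 2^-64 s. */
struct bintime_sketch {
	int64_t  sec;
	uint64_t frac;
};

/* External format: seconds and nanoseconds (stand-in for timespec). */
struct nstime_sketch {
	int64_t sec;
	int32_t nsec;
};

/* Internal -> external: keep the top 32 fraction bits and scale to ns. */
static struct nstime_sketch
bintime_to_nstime(struct bintime_sketch bt)
{
	struct nstime_sketch ts;

	ts.sec = bt.sec;
	/* nsec = frac32 * 1e9 / 2^32, truncated */
	ts.nsec = (int32_t)((1000000000ULL * (uint32_t)(bt.frac >> 32)) >> 32);
	return ts;
}

/* External -> internal: one nanosecond is floor(2^64 / 1e9) fraction units. */
static struct bintime_sketch
nstime_to_bintime(struct nstime_sketch ts)
{
	struct bintime_sketch bt;

	assert(ts.nsec >= 0 && ts.nsec < 1000000000);
	bt.sec = ts.sec;
	bt.frac = (uint64_t)ts.nsec * 18446744073ULL;
	return bt;
}
```

Both directions truncate, so a round trip through the external format can lose a nanosecond. That is one reason to stay in the internal format until the value actually leaves the kernel.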

What is needed for a port to switch to time counters?

Basically just:
    - add __HAVE_TIMECOUNTER to machine/types.h
    - add options TIMECOUNTERS to conf/std.<arch>
    - provide an up-counting counter with fixed, known frequency
    - a counter reading function
    - initialize a timecounter structure around periodic clock interrupt
      initialization and register it with tc_init(struct timecounter *)
    - provide periodic clock interrupts (just like now - no code change
      required for this)
    - call settime(struct timeval *) after determining the current
      time from the TC or file system.
    -> that's it in a nutshell
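
A userland sketch of the two pieces a port has to supply, the counter description and the counter reading function, plus the wrap-safe delta arithmetic that makes a fixed-frequency up-counter usable as a clock. Everything here is an assumption for illustration: struct tc_sketch only imitates the shape of a timecounter structure, the 24-bit fake counter is invented, and registration with tc_init() is left out:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a port's timecounter description. */
struct tc_sketch {
	uint32_t  (*get_count)(void);	/* read the free-running counter */
	uint32_t    mask;		/* which counter bits are implemented */
	uint64_t    frequency;		/* Hz, fixed and known */
	const char *name;
	int         quality;
};

/* Fake 24-bit hardware counter so the sketch can run outside the kernel. */
static uint32_t fake_hw_count;

static uint32_t
fake_get_count(void)
{
	return fake_hw_count & 0x00ffffffu;
}

static struct tc_sketch fake_tc = {
	fake_get_count, 0x00ffffffu, 27000000, "fake27MHz", 1000
};

/* Ticks elapsed since `last`; correct across a single counter wrap. */
static uint32_t
tc_delta(const struct tc_sketch *tc, uint32_t last)
{
	return (tc->get_count() - last) & tc->mask;
}

/* Convert a tick count to nanoseconds (truncated). */
static uint64_t
tc_ticks_to_ns(const struct tc_sketch *tc, uint32_t ticks)
{
	assert(tc->frequency != 0);
	return (uint64_t)ticks * 1000000000ULL / tc->frequency;
}
```

The masked subtraction only works if the counter is sampled at least once per wrap. In this sketch that is the job the periodic clock interrupt would do.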
   
Please feel free to look at the patch (consider it "newer than current"
8-) and give me feedback about the time counter port.

Regards,
  Frank