netbsd-bugs: Re: kern/32035: APIC timer help

Subject: Re: kern/32035: APIC timer help
To: kern-bug-people@netbsd.org, gnats-admin@netbsd.org
From: Frederick Bruckman <fredb@immanent.net>
List: netbsd-bugs
Date: 12/19/2005 17:40:04
The following reply was made to PR kern/32035; it has been noted by GNATS.

From: fredb@immanent.net (Frederick Bruckman)
To: Simon Burge <simonb@wasabisystems.com>
Cc: tech-kern@NetBSD.org, gnats-bugs@NetBSD.org
Subject: Re: kern/32035: APIC timer help
Date: Mon, 19 Dec 2005 11:38:38 -0600 (CST)

 In article <20051207070244.E209723402@thoreau.thistledown.com.au>,
 	Simon Burge <simonb@wasabisystems.com> writes:
 > Simon Burge wrote:
 > 
 >> [ local APIC timer problem discussed ]
 > 
 > I've come to the conclusion that on the problematic machines the APIC
 > timer just doesn't fire with a constant period, for some unknown
 > reason, and that there's nothing we can really do about it.  The
 > patch at
 > 
 >    ftp://ftp.netbsd.org/pub/NetBSD/misc/simonb/mp-time-hack.diff
 > 
 > at least lets time run stably.  The main comment at the top of the patch
 > describes what it does:
 > 
 > 	* Some MP systems have been observed to not have a
 > 	* stable local APIC timer interrupt.  We count the
 > 	* number of TSC cycles since the last call to
 > 	* lapic_clockintr(), and if it has been longer than
 > 	* expected we add in some extra time for hardclock()
 > 	* to add in when it computes the next value of the
 > 	* system "time" variable.  Note that we don't skip
 > 	* time backwards - early arrivals to lapic_clockintr()
 > 	* have only been observed sporadically, and we'll
 > 	* soon catch up.
 > 
 > Longer term, switching to timecounters is a more correct fix since they
 > base time calculations on the TSC counter and not the period of the
 > clock interrupt.  Using HPET timers where available will also help.
 
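 For anyone following along, the logic in that comment can be sketched
 roughly as below. This is only a user-space illustration of the idea,
 not the code from the patch: the struct, the field names and
 tick_extra_usec() are all hypothetical, and the real patch hooks into
 lapic_clockintr() and hardclock() in ways not shown here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-CPU state; these names are NOT taken from the patch. */
struct tick_state {
	uint64_t last_tsc;        /* TSC value at the previous clock interrupt */
	uint64_t cycles_per_tick; /* expected TSC cycles between interrupts */
	uint32_t usec_per_tick;   /* nominal tick length in microseconds */
	int      primed;          /* 0 until the first interrupt has been seen */
};

/*
 * Called once per (simulated) clock interrupt with the current TSC value.
 * Returns the extra microseconds to credit on top of the nominal tick.
 * Late arrivals yield a positive correction; on-time and early arrivals
 * yield 0, so time is never stepped backwards.
 */
uint32_t
tick_extra_usec(struct tick_state *st, uint64_t now_tsc)
{
	uint64_t elapsed, late;
	uint32_t extra = 0;

	assert(st->cycles_per_tick > 0);

	if (st->primed) {
		/* Unsigned subtraction stays correct across counter wrap. */
		elapsed = now_tsc - st->last_tsc;
		if (elapsed > st->cycles_per_tick) {
			late = elapsed - st->cycles_per_tick;
			/* Convert the surplus cycles to microseconds. */
			extra = (uint32_t)(late * st->usec_per_tick /
			    st->cycles_per_tick);
		}
	}
	st->last_tsc = now_tsc;
	st->primed = 1;
	return extra;
}
```

 With 20000 cycles expected per 10000 us tick, an interrupt arriving
 25000 cycles after the previous one would be credited 2500 us extra,
 while one arriving after only 15000 cycles would get no correction.
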
 That sounds really interesting. The problem I see with your theory is
 that it's the same APIC timer in both the one-CPU and two-CPU cases.
 I suspect some latency in the IPI/read-TSC code path.  Maybe the
 "rdtsc" instruction simply isn't in the icache on the slow cycles?
 Experimenting as you suggest would help answer the question.
  
 > I'd be curious if anyone else with SMP boxes that have time keeping
 > problems could test this out and see if it fixes the time problem.
 
 It helps! The frequency (as logged in "/var/log/loopstats") jumps to
 a few hundred PPM under heavy disk I/O, but then settles back down
 without stepping. (The patch was applied to netbsd-3-0.) Yet, on the
 same machine with a non-SMP kernel (2.1 to 3.0_RC6), the frequency
 varies slowly from about 5.0 to 11.0 PPM, depending on ambient
 temperature, so it's clearly not a complete fix.
 
 
 Frederick