Subject: Re: "frequency error ... exceeds tolerance"
To: None <port-alpha@NetBSD.org>
From: der Mouse <mouse@Rodents.Montreal.QC.CA>
List: port-alpha
Date: 08/23/2007 17:37:58
[tnn@]
> X-Hidden-Message: lbh unir jnl gbb zhpu fcner gvzr

Qb V ernyyl?  (V'ir orra jbexvat ba grnpuvat zlfrys gb ernq ebg13 ng
fvtug, naq, juvyr V'z abg ernyyl syhrag, V qvqa'g arrq gb erfbeg gb n
cebtenz be purng-furrg gb ernq lbhe zrffntr.  Uzz, znlor V *qb* unir
gbb zhpu gvzr ba zl unaqf....)

>>> The fix on netbsd-3 is to define the CLKF_BASEPRI macro to 0.
>> [...]  How does that fix this problem?
> If I understand things correctly, it avoids short circuiting
> softclock handling from the hardclock handler. The palcode that
> invokes hardclock doesn't seem to be reentrant, and this makes the
> system lose hardclock ticks.

Perhaps.  But that alone doesn't seem to be the problem; see below.

[Izumi Tsutsui]
> I saw the similar errors on testing timecounter(9) support on alpha
> and "options HZ=1024" fixed the problem.

I added that to my kernel config, and it seems to have cured the
problem.  I got one "setclock:" message shortly after boot and that's
it, and NTP appears to be disciplining the clock very nicely now.

This seems to indicate to me that it's not just losing hardclock ticks
as tnn@ suggested, because I didn't touch CLKF_BASEPRI, so any problem
it caused before should still be present.
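
For a sense of scale, here is a rough back-of-the-envelope sketch of how
lost hardclock ticks would translate into the kind of frequency error
ntpd complains about.  The function names are made up for illustration
and nothing here was measured on this machine; the only outside fact
assumed is ntpd's usual 500 PPM frequency tolerance.

```python
# Hypothetical illustration only; these names do not exist in the kernel or ntpd.
NTP_TOLERANCE_PPM = 500.0  # assumed: ntpd's usual maximum frequency correction


def freq_error_ppm(hz, lost_ticks_per_sec):
    """Apparent clock slowdown, in PPM, if some of the hz ticks/sec go uncounted."""
    return lost_ticks_per_sec / hz * 1_000_000


def exceeds_tolerance(hz, lost_ticks_per_sec):
    """True if the slowdown is more than ntpd could correct by slewing."""
    return freq_error_ppm(hz, lost_ticks_per_sec) > NTP_TOLERANCE_PPM
```

By this arithmetic, at HZ=1024 even one lost tick per second is already
past the tolerance, while one lost tick every two seconds would just
squeak under it.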

This doesn't feel right to me.  Just changing HZ shouldn't make a
problem like this appear or disappear; I suspect this is a symptom of
something more fundamental.

/~\ The ASCII				der Mouse
\ / Ribbon Campaign
 X  Against HTML	       mouse@rodents.montreal.qc.ca
/ \ Email!	     7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B