Subject: Re: time issues in 4.0 rc4 on a PC-ENGINES WRAP (SC1100)
To: David Lord <netbsd@lordynet.org>
From: theo borm <theo_nbsdhelp@borm.org>
List: netbsd-help
Date: 11/29/2007 13:01:21
David Lord wrote:
>On 29 Nov 2007, at 0:09, theo borm wrote:
>
>
>
>>Hi,
>>
>>I seem to have two problems with NetBSD 4.0 rc4, both in the kernel and
>>(putatively) in the userland, and was wondering if anyone has a clue
>>what to do about them.
>>
>>
>>1) Running a 4.0 rc4 kernel on my PC-ENGINES WRAP board causes the
>>clock to run approximately 37 times too slow.
>>
>>The kernel is a custom one (derived from the SOEKRIS one) with options
>>TIMER_FREQ=1189200
>>
>>
>
>I have a few different motherboards that needed option TIMER_FREQ to
>bring the clock frequency within ntpd's capture range. When I last
>updated them in August, I found that with the same kernel config as
>before the frequency offset was way out, adjustments to TIMER_FREQ no
>longer had the desired effect, and when I removed option TIMER_FREQ I
>got a near-zero frequency offset. It seems there is now some working
>auto calibration.
>
>Probably worth trying without that option but 37x slow might be a
>different problem.
>
>David
>
>
Recompiling /without/ the TIMER_FREQ option seems equivalent to
changing the value to the default (1193182 Hz).
The system clock runs a little slower (39x versus 37x) WITHOUT the
TIMER_FREQ option than with TIMER_FREQ=1189200.
I took the plunge and compiled a kernel with what I thought was a
reasonable correction factor (divide by 40) applied to
TIMER_FREQ=1189200. This produced a kernel with a clock that runs
like mad.
cpu0: features 808131<FPU,TSC,MSR,CX8>
WARNING: broken TSC disabled
timecounter: Timecounters tick every 10.000 msec
timecounter: Timecounter "i8254" frequency 29730 Hz quality 100
timecounter: Timecounter "TSC" frequency 6652760 Hz quality 800
timecounter: Timecounter "clockinterrupt" frequency 100 Hz quality 0
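As a quick check of the arithmetic (a sketch only; the linear model in the last step is an assumption, not something the kernel is known to follow): the i8254 frequency reported above is exactly 1189200/40, and a clock that was 37 times too slow would, under that model, end up only about 8% fast rather than running "like mad". The same few lines also show that the default and the custom TIMER_FREQ differ by roughly 0.33% (about 3300 ppm), which is far too small to explain 39x versus 37x but well outside the 500 ppm that ntpd can normally correct.

```python
# Values quoted in this message; the linear model below is an assumption.
DEFAULT_TIMER_FREQ = 1193182   # Hz, kernel default for the i8254
CUSTOM_TIMER_FREQ = 1189200    # Hz, value used in the custom kernel config
DIVISOR = 40                   # the "reasonable correction factor"
OBSERVED_SLOWDOWN = 37         # clock slowdown seen with CUSTOM_TIMER_FREQ

# Relative difference between the default and the custom value.
rel_diff = (DEFAULT_TIMER_FREQ - CUSTOM_TIMER_FREQ) / CUSTOM_TIMER_FREQ
ppm = rel_diff * 1e6
print(f"default vs custom TIMER_FREQ: {rel_diff:.2%} ({ppm:.0f} ppm)")

# Value actually compiled into the "corrected" kernel.
corrected = CUSTOM_TIMER_FREQ // DIVISOR
print(f"TIMER_FREQ / {DIVISOR} = {corrected} Hz")

# ASSUMPTION: dividing TIMER_FREQ by N makes the clock run N times faster.
predicted_speed = DIVISOR / OBSERVED_SLOWDOWN
print(f"predicted clock speed: {predicted_speed:.2f}x real time")
```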
After a while a trickle of the following errors:
wdc0:0:0: lost interrupt
type: ata tc_bcount: 16384 tc_skip: 0
wd0a: device timeout reading fsbn 7107072 of 7107072-7107103 (wd0 bn
7107135; cn 3470 tn 17 sn 31), retrying
wd0: soft error (corrected)
turn into a stream of:
wdc0 channel 0: reset failed for drive 0
wdc0:0:0: wait timed out
wd0a: device timeout writing fsbn 2501312 of 2501312-2501343 (wd0 bn
2501375; cn 1221 tn 23 sn 31), retrying
(Note: these errors do NOT occur with a kernel with a "more normal"
TIMER_FREQ.)
On top of that, the LEDs on all LAN ports blink erratically, and
sleep 1 now definitely sleeps LESS than one second.
In other words: definitely *not* good.
After this I took a more cautious approach, and compiled a few kernels
with a range of settings:
TIMER_FREQ=1189200/2=594600
-> time/date runs ~ 18 times too slow
-> sleep 120 takes 60 seconds
timecounter: Timecounter "i8254" frequency 594600 Hz quality 100
timecounter: Timecounter "TSC" frequency 132888380 Hz quality 800
TIMER_FREQ=1189200/4=297300
-> time/date runs ~ 8 times too slow
-> sleep 240 takes 60 seconds
timecounter: Timecounter "i8254" frequency 297300 Hz quality 100
timecounter: Timecounter "TSC" frequency 66434860 Hz quality 800
TIMER_FREQ=1189200/8=148650
-> time/date runs ~ 3.2 times too slow
-> sleep 480 takes 60 seconds
timecounter: Timecounter "i8254" frequency 148650 Hz quality 100
timecounter: Timecounter "TSC" frequency 33226590 Hz quality 800
TIMER_FREQ=1189200/16=74325
-> time/date runs ~ 1.15 times too slow
-> sleep 960 takes 60 seconds
timecounter: Timecounter "i8254" frequency 74325 Hz quality 100
timecounter: Timecounter "TSC" frequency 16614960 Hz quality 800
In the last case (TIMER_FREQ=74325) the
wdc0:0:0: lost interrupt
type: ata tc_bcount: 16384 tc_skip: 0
wd0a: device timeout reading fsbn 7107072 of 7107072-7107103 (wd0 bn
7107135; cn 3470 tn 17 sn 31), retrying
wd0: soft error (corrected)
errors are back.
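Putting the numbers from these test kernels side by side may help. The sketch below only re-uses values quoted in this message; the "linear prediction" column assumes the clock slowdown would scale as 37/divisor, which is a hypothesis being tested here, not a known property of the kernel. The function name analyse and the field names are made up for illustration.

```python
BASE_TIMER_FREQ = 1189200  # Hz, from the kernel config
BASE_SLOWDOWN = 37         # clock slowdown observed with the base value

# (divisor, i8254 Hz, TSC Hz, clock slowdown, sleep argument that took 60 s)
# None marks values that were not reported for the divide-by-40 kernel.
RUNS = [
    (2, 594600, 132888380, 18.0, 120),
    (4, 297300, 66434860, 8.0, 240),
    (8, 148650, 33226590, 3.2, 480),
    (16, 74325, 16614960, 1.15, 960),
    (40, 29730, 6652760, None, None),
]

def analyse(runs):
    """Derive per-kernel ratios from the reported measurements."""
    rows = []
    for divisor, i8254_hz, tsc_hz, slowdown, sleep_arg in runs:
        # Sanity check: the reported i8254 rate is the base value / divisor.
        assert BASE_TIMER_FREQ // divisor == i8254_hz
        rows.append({
            "divisor": divisor,
            "tsc_per_tick": tsc_hz / i8254_hz,
            "sleep_speedup": sleep_arg / 60 if sleep_arg is not None else None,
            # ASSUMPTION: slowdown scales linearly with the divisor.
            "linear_prediction": BASE_SLOWDOWN / divisor,
            "observed_slowdown": slowdown,
        })
    return rows

for row in analyse(RUNS):
    print(f"/{row['divisor']:<3} TSC/i8254={row['tsc_per_tick']:.2f} "
          f"sleep speed-up={row['sleep_speedup']} "
          f"clock: predicted {row['linear_prediction']:.2f}x slow, "
          f"observed {row['observed_slowdown']}")
```

With these inputs the TSC/i8254 ratio stays between 223 and 224 for every kernel, so the TSC figure appears to be calibrated against the assumed i8254 rate (at 1189200 Hz that ratio would correspond to roughly 266 MHz). The sleep speed-up equals the divisor exactly, while the observed clock slowdown (18, 8, 3.2, 1.15) falls faster than the linear prediction (18.5, 9.25, about 4.6, about 2.3).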
Is it safe to conclude that this is a bug in the kernel?
regards,
Theo