Port-macppc archive


Re: ATA driver and the clock



On Sun, Jun 16, 2013 at 09:22:41PM -0500, Donald Lee wrote:
> I had a problem this last couple of days where I was seeing the HW clock run
> backwards, only under heavy (and only particular types of) disk load.
> 
> I was seeing this sort of thing in the log:
> 
> Jun 15 20:48:41 charm mdnsd: mDNSPlatformRawTime went backwards by 275 ticks; setting correction factor to 1462875659
> Jun 15 20:49:18 charm mdnsd: mDNSPlatformRawTime went backwards by 503 ticks; setting correction factor to 1462876162
> Jun 15 20:49:24 charm mdnsd: mDNSPlatformRawTime went backwards by 350 ticks; setting correction factor to 1462876512
> Jun 15 20:50:41 charm mdnsd: mDNSPlatformRawTime went backwards by 438 ticks; setting correction factor to 1462876950
> Jun 15 20:52:33 charm /netbsd: wdc0:0:0: lost interrupt
> Jun 15 20:52:33 charm /netbsd:  type: ata tc_bcount: 16384 tc_skip: 0
> Jun 15 20:52:33 charm /netbsd: wdc0:0:0: lost interrupt
> Jun 15 20:52:33 charm /netbsd:  type: ata tc_bcount: 2048 tc_skip: 0
> Jun 15 20:52:47 charm /netbsd: wdc0:0:0: lost interrupt
> Jun 15 20:52:47 charm /netbsd:  type: ata tc_bcount: 16384 tc_skip: 0
> Jun 15 20:52:48 charm mdnsd: mDNSPlatformRawTime went backwards by 315 ticks; setting correction factor to 1462877265
> Jun 15 20:53:29 charm mdnsd: mDNSPlatformRawTime went backwards by 179 ticks; setting correction factor to 1462877444
> 
> 
> I also could run "top -s 1" and see the clock "stop" while the disk
> load was active.  When the activity stopped, the clock would resume.
> 
> I was afraid this was HW, and it was - the ATA cable.
> 
> I swapped out the cable, and now the problem is (almost?)
> gone.
> 
> Me thinks that a bad cable should cause disk errors, and maybe

You have a "disk error" in the log you posted above: the lost interrupt.
The disk is not at fault; it's a communication issue between the disk and
the controller. And the driver managed to recover from this situation
(but you're lucky :)

> delays, but mess up the clock and make it run backwards?
> Not so much.
> 
> The ATA (wd) driver must be turning off interrupts in some brutal
> way, and then wandering off to handle (long duration)
> errors.

It does; the disk reset used to recover from this situation may be using
delay(9). Maybe it could be done in some other way, but I think it wouldn't
be a good thing to add complexity to this code (as there's basically no
way to test it on demand) to "fix" a rare problem which, really,
needs to be fixed at the hardware level anyway.

-- 
Manuel Bouyer <bouyer%antioche.eu.org@localhost>
     NetBSD: 26 years of experience will always make the difference
--