Subject: Re: wd interface CRC errors
To: David Maxwell <david@vex.net>
From: Greg Oster <oster@cs.usask.ca>
List: netbsd-help
Date: 10/20/2006 10:55:14
David Maxwell writes:
>
> I have a NetBSD 2.0.2, i386, with an uptime of 238 days. The install
> is about a year and a half old, running 24/7 since May 2005.
>
> It has three WDs, and wd1/wd2 are a mirror. Nothing has changed
> physically in the system, so the usual 'cable problem' suggestion
> doesn't seem to apply.
If you haven't already, I'd check that the cables are seated all the way...
(I've had them work their way out a little over time, and cause these
sorts of issues...)
> These started in September, and have become common:
>
> Sep 19 08:58:28 mail /netbsd: wd0a: error writing fsbn 1114304 of 1114304-1114319 (wd0 bn 1114367; cn 1105 tn 8 sn 23), retrying
> Sep 19 08:58:29 mail /netbsd: wd0: (aborted command, interface CRC error)
> Sep 19 08:58:29 mail /netbsd: wd0: soft error (corrected)
>
> They show up on wd0 and wd1, which share a controller, and a cable.
> All errors are corrected so far. (18 on wd0, 19 on wd1)
>
> All of the errors occur while writing, and there's no locality
> amongst the sectors involved in the errored writes.
Are there any "time-of-day" correlations (e.g. when /etc/daily is running)
that might point to a heavy disk load (and hence power draw), and
possibly to a power supply that is starting to fail?
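One way to look for such a correlation is to tally the CRC errors by hour
of day. A minimal sketch in Python, assuming the log lines look like the
ones quoted above; the function name errors_by_hour and the log path in
the usage comment are illustrative only, not part of any NetBSD tool:

```python
import re
import sys
from collections import Counter

# Matches the syslog hour and the device name on lines such as
# "Sep 19 08:58:29 mail /netbsd: wd0: (aborted command, interface CRC error)"
CRC_LINE = re.compile(
    r"^\w{3}\s+\d+\s+(\d{2}):\d{2}:\d{2}\s+\S+\s+/netbsd:\s+(wd\d+): "
    r".*interface CRC error"
)

def errors_by_hour(lines):
    """Count interface CRC errors per (device, hour-of-day)."""
    counts = Counter()
    for line in lines:
        m = CRC_LINE.match(line)
        if m:
            counts[(m.group(2), int(m.group(1)))] += 1
    return counts

if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (log path is an assumption): python3 crc_by_hour.py /var/log/messages
    with open(sys.argv[1], errors="replace") as f:
        for (dev, hour), n in sorted(errors_by_hour(f).items()):
            print(f"{dev} {hour:02d}:00  {n}")
```

If most of the counts pile up in the hour when the nightly jobs run, that
would be consistent with a load-related cause; an even spread would not.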
> The smart status shows a high count on wd0 for raw read error rate and
> hardware ECC recovered errors, so I'm inclined to replace that drive.
Be careful with these numbers from the SMART info... I've got a few Seagate
drives where the raw read error rate and hardware ECC recovered error rate
move in lock-step, and at a rate of 6/second when the drive is idle (and much
higher when the drive is active). I don't have the URLs handy, but
this is apparently a 'known issue' with some Seagate drives...
You might run some of the tools from sysutils/smartmontools to see if
they give any more info (and/or run the SMART diagnostic bits...).
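To see whether those two counters really do move in lock-step, one could
sample the attribute table twice and compare the raw values. A rough
sketch in Python, assuming the usual `smartctl -A` table layout with the
raw value in the tenth column; the helper names parse_attributes and
lockstep, the attribute names, the 60-second interval, and the device
path in the usage comment are assumptions and may differ on your system:

```python
import subprocess
import sys
import time

def parse_attributes(text):
    """Map attribute name -> raw value from `smartctl -A` table output."""
    attrs = {}
    for line in text.splitlines():
        parts = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns:
        # ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(parts) >= 10 and parts[0].isdigit():
            try:
                attrs[parts[1]] = int(parts[9])
            except ValueError:
                pass  # raw value is not a plain integer; skip it
    return attrs

def lockstep(before, after,
             a="Raw_Read_Error_Rate", b="Hardware_ECC_Recovered"):
    """True if both counters grew by the same nonzero amount."""
    da = after[a] - before[a]
    db = after[b] - before[b]
    return da == db and da != 0

if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (device path is an assumption): python3 lockstep.py /dev/wd0d
    def sample(dev):
        out = subprocess.run(["smartctl", "-A", dev],
                             capture_output=True, text=True).stdout
        return parse_attributes(out)
    first = sample(sys.argv[1])
    time.sleep(60)  # arbitrary sampling interval
    print("lock-step:", lockstep(first, sample(sys.argv[1])))
```

If the two counters climb together even on an idle drive, the raw numbers
are probably not a good reason by themselves to replace it.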
Later...
Greg Oster