NetBSD-Users archive


Re: Disk errors that aren't disk errors?



Lloyd Parkes wrote:
Hi,
While testing some Windows backup software I encountered an issue with the NetBSD box that I was backing up to. The NetBSD box is an amd64 machine running 5.0_BETA and the disk is a SATA disk in a Vantec eSATA dock. I can read from and write to the entire disk with dd without triggering any kernel messages. The file system is UFS with logging.

The following message log shows my boot messages and the error messages (it's just "grep /netbsd"). The errors always show up some multiple of two minutes and a few seconds apart. It's always the same block number and I/O performance turns to custard for about half a minute (where custard is 1MB/s instead of 8MB/s).

Given that dd works, but the mounted file system doesn't, I'm guessing that the filesystem is asking the device to do something that it can't do. It may be a coincidence, but the first error report was two minutes and a few seconds after the backup (via rsync) started writing to the disk. The disk had been mounted for some time before then. sync(8) did not trigger any error messages.

Disk performance was monitored with "iostat wd0 wd1 wd2 5".

Any ideas would be appreciated.
8<
Jan 19 20:20:44 maro /netbsd: wd2e: error writing fsbn 624993601 (wd2 bn 624995649; cn 305173 tn 42 sn 1), retrying
Jan 19 20:20:44 maro /netbsd: wd2: (aborted command)
Jan 19 20:20:44 maro /netbsd: wd2: soft error (corrected)
Jan 19 20:22:46 maro /netbsd: wd2e: error writing fsbn 624993601 (wd2 bn 624995649; cn 305173 tn 42 sn 1), retrying
Jan 19 20:22:46 maro /netbsd: wd2: (aborted command)
Jan 19 20:22:46 maro /netbsd: wd2: soft error (corrected)
8<
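
The timing and block-number observations above can be checked mechanically against the excerpt. The following is only a rough sketch, not anything taken from the original report: the names PATTERN, parse_errors and PLACEHOLDER_YEAR are made up for illustration, and because syslog lines carry no year, an arbitrary placeholder year is assumed so the timestamps can be parsed.

```python
import re
from datetime import datetime

# The two "error writing" lines from the excerpt above.
LOG = """\
Jan 19 20:20:44 maro /netbsd: wd2e: error writing fsbn 624993601 (wd2 bn 624995649; cn 305173 tn 42 sn 1), retrying
Jan 19 20:22:46 maro /netbsd: wd2e: error writing fsbn 624993601 (wd2 bn 624995649; cn 305173 tn 42 sn 1), retrying
"""

# Hypothetical pattern: timestamp, partition-relative block (fsbn),
# and whole-disk block (bn).
PATTERN = re.compile(
    r"^(\w{3} +\d+ \d\d:\d\d:\d\d) .*error writing fsbn (\d+) \(\w+ bn (\d+);"
)

# Assumption: syslog omits the year, so any fixed year will do for
# computing intervals within a single day.
PLACEHOLDER_YEAR = "2000"


def parse_errors(log):
    """Return a list of (timestamp, fsbn, bn) for each write-error line."""
    events = []
    for line in log.splitlines():
        m = PATTERN.match(line)
        if m:
            ts = datetime.strptime(
                PLACEHOLDER_YEAR + " " + m.group(1), "%Y %b %d %H:%M:%S"
            )
            events.append((ts, int(m.group(2)), int(m.group(3))))
    return events


events = parse_errors(LOG)

# Seconds between consecutive errors.
gaps = [(b[0] - a[0]).total_seconds() for a, b in zip(events, events[1:])]

# Distinct failing blocks, and the fsbn-to-bn offset for each event.
blocks = {fsbn for _, fsbn, _ in events}
offsets = {bn - fsbn for _, fsbn, bn in events}

print("gaps (s):", gaps)
print("distinct fsbn:", blocks)
print("bn - fsbn:", offsets)
```

For this excerpt the gap is 122 seconds, which matches "two minutes and a few seconds", and both errors name the same fsbn. The constant difference of 2048 between bn and fsbn is presumably the starting offset of the wd2e partition on the disk, though the log alone does not confirm that.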

Hi,

When I was running 5.0_BETA on i386 I saw similar occurrences of soft errors on a SATA disk. I traced them back to two atactl commands that I ran at startup to set the standby time and idle time. Removing the atactl commands removed the errors.

This might help. I've since upgraded to RC1 and haven't used atactl again, so I can't confirm whether the problem still occurs there.

Best regards,

Phil

