NetBSD-Users archive
Troubleshooting correctable errors?
I recently added a pair of disks (wd2/wd3) as a mirrored zpool to a
system, and shortly after doing so, discovered one of the DIMMs was bad,
which was causing some... issues.
After removing the bad DIMM, most of the issues went away, but I was
still seeing errors like this occasionally:
[ 660.618971] ahcisata0 port 5: device present, speed: 3.0Gb/s
[ 660.618971] wd3d: channel reset reading fsbn 3858134944 of
3858134944-3858135071 (wd3 bn 3858134944; cn 3827514 tn 13 sn 13), xfer
38, retry 0
[ 660.618971] wd3d: channel reset reading fsbn 3858135200 of
3858135200-3858135327 (wd3 bn 3858135200; cn 3827515 tn 1 sn 17), xfer
200, retry 0
[ 660.618971] wd3d: channel reset reading fsbn 3858135328 of
3858135328-3858135455 (wd3 bn 3858135328; cn 3827515 tn 3 sn 19), xfer
298, retry 0
[ 661.128968] wd3: soft error (corrected) xfer 38
[ 661.128968] wd3: soft error (corrected) xfer 200
[ 661.128968] wd3: soft error (corrected) xfer 298
Fine, looks like a dying disk - this didn't make too much sense to me
(as the disk was pretty new), but spinning rust doesn't last forever.
After putting it off for a few weeks (during which `zpool status` never
revealed any problems, even after a scrub), I finally just replaced the
disk. Sure enough, the errors are still there. (The ones above are from
the new disk, actually.)
It looks to me like maybe there's a problem on the SATA controller? I'm
hoping someone can give me some suggestions of ways to troubleshoot this
- preferably without a fresh outlay of cash. :) I have two more free
ports on the SATA controller, which I will probably try using after
sending this email, but I'm wondering what else might be going on.
Suggestions of other stuff to look at gratefully accepted - as far as I
can tell, it's only this disk (three others seem OK), and all the errors
have been corrected.
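
One way to keep an eye on the "only this disk" observation, for instance before and after moving the drive to another port, is to tally the corrected-error lines per device from the kernel message buffer. The following is only a rough sketch in Python: the regular expression is an assumption based solely on the "soft error (corrected)" lines quoted above, and the function name count_soft_errors is made up for illustration.

```python
import re
from collections import Counter

# Assumed line shape, taken only from the messages quoted above, e.g.
#   [ 661.128968] wd3: soft error (corrected) xfer 38
# The leading timestamp is treated as optional.
SOFT_ERROR = re.compile(
    r"^(?:\[\s*[\d.]+\]\s+)?(wd\d+): soft error \(corrected\)"
)


def count_soft_errors(lines):
    """Return a Counter mapping device name (e.g. 'wd3') to the
    number of corrected soft-error lines seen for that device."""
    counts = Counter()
    for line in lines:
        match = SOFT_ERROR.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Fed the output of dmesg line by line (for example via sys.stdin), this gives one count per wd device. If the count stays on the same port after the disks are swapped around, that would point at the port, cable, or controller rather than the drive.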
It's a NetBSD 10.1/amd64 system with a XEN3_DOM0 kernel, and a Xen 4.20
hypervisor, if that matters.
+j