Subject: Various disk errors just showed up in dmesg :(
To: None <netbsd-users@netbsd.org>
From: Mark Cullen <mark.r.cullen@gmail.com>
List: netbsd-users
Date: 06/28/2006 23:23:59
Well this is fantastic! All my hardware seems to be dying on me as of 
late! It's just not funny any more!

---
Jun 27 22:25:48 bone /netbsd: wd1a: DMA error writing fsbn 118105728 of 
118105728-118105747 (wd1 bn 118105791; cn 117168 tn 7 sn 6), retrying
Jun 27 22:25:48 bone /netbsd: wd1: soft error (corrected)
Jun 27 22:26:05 bone /netbsd: wd1a: device fault writing fsbn 110309504 
of 110309504-110309535 (wd1 bn 110309567; cn 109434 tn 1 sn 32), retrying
Jun 27 22:26:05 bone /netbsd: wd1: soft error (corrected)
Jun 28 01:33:23 bone /netbsd: wd1a: device fault writing fsbn 119347360 
of 119347360-119347363 (wd1 bn 119347423; cn 118400 tn 3 sn 34), retrying
Jun 28 01:33:23 bone /netbsd: wd1: soft error (corrected)
Jun 28 01:33:34 bone /netbsd: wd1a: device fault writing fsbn 119412992 
of 119412992-119413023 (wd1 bn 119413055; cn 118465 tn 5 sn 20), retrying
Jun 28 01:33:35 bone /netbsd: wd1: soft error (corrected)
Jun 28 01:33:53 bone /netbsd: wd1a: device fault writing fsbn 758016 of 
758016-758047 (wd1 bn 758079; cn 752 tn 1 sn 0), retrying
Jun 28 01:33:53 bone /netbsd: wd1: soft error (corrected)
Jun 28 01:49:05 bone /netbsd: wd1a: device fault writing fsbn 119372416 
of 119372416-119372447 (wd1 bn 119372479; cn 118425 tn 1 sn 16), retrying
Jun 28 01:49:05 bone /netbsd: wd1: soft error (corrected)
Jun 28 01:56:36 bone /netbsd: wd1a: device fault reading fsbn 4992064 of 
4992064-4992159 (wd1 bn 4992127; cn 4952 tn 8 sn 7), retrying
Jun 28 01:56:38 bone /netbsd: wd1: soft error (corrected)
Jun 28 07:25:47 bone /netbsd: wd1a: device fault writing fsbn 13261056 
of 13261056-13261087 (wd1 bn 13261119; cn 13155 tn 13 sn 60), retrying
Jun 28 07:25:47 bone /netbsd: wd1: soft error (corrected)
Jun 28 07:25:59 bone /netbsd: wd1a: device fault writing fsbn 113115672 
of 113115672-113115675 (wd1 bn 113115735; cn 112217 tn 15 sn 54), retrying
Jun 28 07:26:00 bone /netbsd: wd1: soft error (corrected)
Jun 28 07:26:38 bone /netbsd: wd1a: DMA error writing fsbn 101559196 of 
101559196-101559199 (wd1 bn 101559259; cn 100753 tn 3 sn 46), retrying
Jun 28 07:26:38 bone /netbsd: wd1: soft error (corrected)
Jun 28 07:26:49 bone /netbsd: wd1a: device fault writing fsbn 110312256 
of 110312256-110312287 (wd1 bn 110312319; cn 109436 tn 13 sn 12), retrying
Jun 28 07:26:50 bone /netbsd: wd1: soft error (corrected)
Jun 28 07:27:07 bone /netbsd: wd1a: DMA error writing fsbn 113115672 of 
113115672-113115675 (wd1 bn 113115735; cn 112217 tn 15 sn 54), retrying
Jun 28 07:27:08 bone /netbsd: wd1: soft error (corrected)
Jun 28 09:18:04 bone /netbsd: wd1a: DMA error writing fsbn 110255456 of 
110255456-110255487 (wd1 bn 110255519; cn 109380 tn 7 sn 38), retrying
Jun 28 09:18:04 bone /netbsd: wd1: soft error (corrected)
Jun 28 09:30:22 bone /netbsd: wd1a: DMA error writing fsbn 1068672 of 
1068672-1068675 (wd1 bn 1068735; cn 1060 tn 4 sn 3), retrying
Jun 28 09:30:23 bone /netbsd: wd1: soft error (corrected)
Jun 28 13:14:55 bone /netbsd: wd1a: device fault writing fsbn 758016 of 
758016-758047 (wd1 bn 758079; cn 752 tn 1 sn 0), retrying
Jun 28 13:14:55 bone /netbsd: wd1: soft error (corrected)
Jun 28 17:32:33 bone /netbsd: wd1a: device fault writing fsbn 110255584 
of 110255584-110255615 (wd1 bn 110255647; cn 109380 tn 9 sn 40), retrying
Jun 28 17:32:34 bone /netbsd: wd1: soft error (corrected)
Jun 28 23:04:10 bone /netbsd: wd1a: error reading fsbn 2539072 of 
2539072-2539199 (wd1 bn 2539135; cn 2518 tn 15 sn 46), retrying
Jun 28 23:04:10 bone /netbsd: wd1: (obsolete (address mark not found), 
no media/write protected, id not found, uncorrectable data error)
Jun 28 23:04:12 bone /netbsd: wd1: soft error (corrected)
---
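
As a sanity check on the numbers in these messages, the rough sketch below recomputes the whole-disk block number ("wd1 bn") and the cn/tn/sn triple from the fsbn. The 63-sector partition offset and the 16-head, 63-sector geometry are assumptions inferred from the excerpt above, not values read from the disklabel, and the function names are made up for illustration.

```python
# Assumed values, inferred from the log excerpt rather than from the disklabel.
PARTITION_OFFSET = 63   # sectors between the start of wd1 and the start of wd1a
HEADS = 16              # tracks per cylinder
SECTORS = 63            # sectors per track


def disk_block(fsbn):
    """Translate a block number within wd1a into a block number on wd1."""
    return fsbn + PARTITION_OFFSET


def to_chs(bn):
    """Split an absolute block number into (cylinder, track, sector)."""
    cylinder, rest = divmod(bn, HEADS * SECTORS)
    track, sector = divmod(rest, SECTORS)
    return cylinder, track, sector


# (fsbn, reported "wd1 bn", reported cn/tn/sn), copied from three messages.
REPORTS = [
    (118105728, 118105791, (117168, 7, 6)),
    (758016, 758079, (752, 1, 0)),
    (2539072, 2539135, (2518, 15, 46)),
]


def consistent(fsbn, bn, chs):
    """True if the reported bn and cn/tn/sn follow from the fsbn."""
    return disk_block(fsbn) == bn and to_chs(bn) == chs


if __name__ == "__main__":
    for fsbn, bn, chs in REPORTS:
        print(fsbn, bn, chs, consistent(fsbn, bn, chs))
```

If the assumptions hold, the three sample reports are internally consistent, which suggests the block numbers themselves are being reported sensibly and the failing blocks really are scattered across the disk.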

This disk is probably not even a month old either! Brand new Seagate 
7200.9. I've checked the SMART status and it's showing no remapped bad 
blocks, which is reassuring...

---
SMART supported, SMART enabled
id value thresh crit collect reliability description                 raw
  1 116    6     yes online  positive    Raw read error rate        110385059
  3 100    0     yes online  positive    Spin-up time               0
  4 100   20     no  online  positive    Start/stop count           17
  5 100   36     yes online  positive    Reallocated sector count   0
  7  70   30     yes online  positive    Seek error rate            10919452
  9 100    0     no  online  positive    Power-on hours count       509
 10 100   97     yes online  positive    Spin retry count           0
 12 100   20     no  online  positive    Device power cycle count   66
187 100    0     no  online  positive    Unknown                    0
189 100    0     no  online  positive    Unknown                    0
190  58   45     no  online  positive    Unknown                    707330090
194  42    0     no  online  positive    Temperature                42 Lifetime max/min 0/31
195  78    0     no  online  positive    Hardware ECC Recovered     98608344
197   1    0     no  online  positive    Current pending sector     4294967295
198   1    0     no  offline positive    Offline uncorrectable      4294967295
199 200    0     no  online  positive    Ultra DMA CRC error count  0
200 100    0     no  offline positive    Write error rate           0
202 100    0     no  online  positive    Data address mark errors   0
---
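
To make the table a bit easier to read, here is a small sketch that parses a few of the rows and reports how far each normalised value sits above its threshold. It also flags raw values equal to 4294967295, which is 2**32 - 1 (all 32 bits set); whether that is a placeholder or a decoding quirk rather than a real sector count is only a guess. The sample rows are copied from the output above with each row on one line, and `parse_rows` is a made-up helper, not part of any SMART tool.

```python
import re

ALL_ONES_32 = 2**32 - 1  # 4294967295

SAMPLE = """\
  1 116    6     yes online  positive    Raw read error rate        110385059
  5 100   36     yes online  positive    Reallocated sector count   0
  7  70   30     yes online  positive    Seek error rate            10919452
190  58   45     no  online  positive    Unknown                    707330090
197   1    0     no  online  positive    Current pending sector     4294967295
"""


def parse_rows(text):
    """Parse attribute rows into dicts; lines that do not fit are skipped."""
    rows = []
    for line in text.splitlines():
        fields = line.split(None, 6)
        if len(fields) < 7 or not fields[0].isdigit():
            continue
        # Description and raw value are separated by two or more spaces.
        parts = re.split(r"\s{2,}", fields[6].strip(), maxsplit=1)
        description = parts[0]
        raw = parts[1] if len(parts) > 1 else ""
        value, thresh = int(fields[1]), int(fields[2])
        rows.append({
            "id": int(fields[0]),
            "critical": fields[3] == "yes",
            "margin": value - thresh,      # how far above the threshold
            "description": description,
            "raw": raw,
            "all_ones": raw.isdigit() and int(raw) == ALL_ONES_32,
        })
    return rows


if __name__ == "__main__":
    for row in sorted(parse_rows(SAMPLE), key=lambda r: r["margin"]):
        note = " (raw is 2**32 - 1)" if row["all_ones"] else ""
        print(f'{row["id"]:3d} margin {row["margin"]:3d} '
              f'{row["description"]}{note}')
```

In this sample every value is still above its threshold, so nothing counts as failed by the drive's own criteria.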


Is this most likely just the cable? That last error looks quite 
worrying... I do believe that I have another disk attached on the same 
cable as that drive though, so I would have thought I would also be 
seeing errors on wd0. Hmm....
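
To see how the messages split between DMA errors and device faults, and between reads and writes, a quick tally along these lines might help. This is only a sketch: the regular expression is an assumption based on the lines quoted above rather than a general parser for kernel messages, `tally` is a made-up helper name, and the sample is just three of the reports.

```python
import re
from collections import Counter

# Matches the first line of each report, for example
# "wd1a: DMA error writing fsbn 118105728 of".
PATTERN = re.compile(
    r"(wd\d+[a-p]): (DMA error|device fault|error) (reading|writing) fsbn (\d+)"
)


def tally(log_text):
    """Count (kind, operation) pairs in the kernel messages."""
    counts = Counter()
    for match in PATTERN.finditer(log_text):
        _partition, kind, operation, _fsbn = match.groups()
        counts[(kind, operation)] += 1
    return counts


SAMPLE = """\
Jun 27 22:25:48 bone /netbsd: wd1a: DMA error writing fsbn 118105728 of
118105728-118105747 (wd1 bn 118105791; cn 117168 tn 7 sn 6), retrying
Jun 27 22:26:05 bone /netbsd: wd1a: device fault writing fsbn 110309504
of 110309504-110309535 (wd1 bn 110309567; cn 109434 tn 1 sn 32), retrying
Jun 28 23:04:10 bone /netbsd: wd1a: error reading fsbn 2539072 of
2539072-2539199 (wd1 bn 2539135; cn 2518 tn 15 sn 46), retrying
"""

if __name__ == "__main__":
    for (kind, operation), n in sorted(tally(SAMPLE).items()):
        print(f"{kind:12s} {operation:8s} {n}")
```

Feeding it the whole of /var/log/messages instead of the sample would also show whether wd0 ever appears, which bears on the shared-cable question.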