Subject: Data errors on hard drive
To: None <netbsd-users@netbsd.org>
From: Brian de Alwis <bsd@cs.ubc.ca>
List: netbsd-users
Date: 04/11/2003 12:33:00
My hard drive appears to be developing some bad sectors.
There are three sectors consistently causing data errors: 11499608,
11499610, and 11499611.  There were some other sectors that came
up previously, all on the same cylinder and tracks, but they've
disappeared since (drive magically whisked them away, I suppose).

I had some questions I hoped somebody here might answer. 

 1. I'm guessing this is pretty serious :-/  Especially given those
    magically whisked-away sectors; I thought the drive should have
    remapped them automagically without ever showing a read error?
    Should I be buying a new drive ASAP?

 2. Is there any way of fixing this?  I've never played with
    bad144, and the few messages I could find had some reservations
    about its use.

 3. Is there a convenient way of mapping the block numbers to an
    actual file?  I couldn't figure out easy magic from fsdb.  I'm
    doing frequent backups, but it would be lovely to know which
    files are affected!

I've appended the console output from hitting one of these errors,
an extract from the boot-time dmesg, and the disklabel.

Thanks for any help.

------------------------------------------------------------
wd0d: error reading fsbn 11499608 of 11499608-11499623 (wd0 bn 11499608; cn 11408 tn 5 sn 29), retrying
wd0: (uncorrectable data error)
wd0d: error reading fsbn 11499608 of 11499608-11499623 (wd0 bn 11499608; cn 11408 tn 5 sn 29), retrying
wd0: (uncorrectable data error)
wd0d: error reading fsbn 11499608 of 11499608-11499623 (wd0 bn 11499608; cn 11408 tn 5 sn 29), retrying
wd0: (uncorrectable data error)
wd0d: error reading fsbn 11499608 of 11499608-11499623 (wd0 bn 11499608; cn 11408 tn 5 sn 29), retrying
wd0: (uncorrectable data error)
wd0d: error reading fsbn 11499611 of 11499608-11499623 (wd0 bn 11499611; cn 11408 tn 5 sn 32), retrying
wd0: (uncorrectable data error)
wd0d: error reading fsbn 11499611 of 11499608-11499623 (wd0 bn 11499611; cn 11408 tn 5 sn 32)wd0: (uncorrectable data error)

wd0d: error reading fsbn 11499610 of 11499610-11499625 (wd0 bn 11499610; cn 11408 tn 5 sn 31), retrying
wd0: (uncorrectable data error)
wd0d: error reading fsbn 11499610 of 11499610-11499625 (wd0 bn 11499610; cn 11408 tn 5 sn 31), retrying
wd0: (uncorrectable data error)
wd0d: error reading fsbn 11499610 of 11499610-11499625 (wd0 bn 11499610; cn 11408 tn 5 sn 31), retrying
wd0: (uncorrectable data error)
wd0d: error reading fsbn 11499610 of 11499610-11499625 (wd0 bn 11499610; cn 11408 tn 5 sn 31), retrying
wd0: (uncorrectable data error)
wd0d: error reading fsbn 11499611 of 11499610-11499625 (wd0 bn 11499611; cn 11408 tn 5 sn 32), retrying
wd0: (uncorrectable data error)
wd0d: error reading fsbn 11499611 of 11499610-11499625 (wd0 bn 11499611; cn 11408 tn 5 sn 32)wd0: (uncorrectable data error)

wd0d: error reading fsbn 11499611 (wd0 bn 11499611; cn 11408 tn 5 sn 32), retrying
wd0: (uncorrectable data error)
wd0d: error reading fsbn 11499611 (wd0 bn 11499611; cn 11408 tn 5 sn 32), retrying
wd0: (uncorrectable data error)
wd0d: error reading fsbn 11499611 (wd0 bn 11499611; cn 11408 tn 5 sn 32), retrying
wd0: (uncorrectable data error)
wd0d: error reading fsbn 11499611 (wd0 bn 11499611; cn 11408 tn 5 sn 32), retrying
wd0: (uncorrectable data error)
wd0d: error reading fsbn 11499611 (wd0 bn 11499611; cn 11408 tn 5 sn 32), retrying
wd0: (uncorrectable data error)
wd0d: error reading fsbn 11499611 (wd0 bn 11499611; cn 11408 tn 5 sn 32)wd0: (uncorrectable data error)
------------------------------------------------------------
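
As a sanity check on the messages above, the cn/tn/sn triples can be
converted back to absolute block numbers.  This is only a rough sketch:
it assumes the 16-head, 63-sectors-per-track geometry the drive reports
at boot, and that sn counts from 0.  The name chs_to_lba is made up for
illustration and is not a NetBSD tool.

```python
# Geometry as reported by the drive at boot (assumed to apply to these messages).
HEADS = 16
SECTORS_PER_TRACK = 63


def chs_to_lba(cyl, head, sec):
    """Convert cylinder/track/sector to an absolute block number.

    Assumes the sector number counts from 0, which is what makes the
    figures in the error messages line up.
    """
    return (cyl * HEADS + head) * SECTORS_PER_TRACK + sec


# (bn, cn, tn, sn) copied from the error messages.
reported = [
    (11499608, 11408, 5, 29),
    (11499610, 11408, 5, 31),
    (11499611, 11408, 5, 32),
]

for bn, cn, tn, sn in reported:
    # Prints the block number and whether the CHS triple maps back to it.
    print(bn, chs_to_lba(cn, tn, sn) == bn)
```

Under those assumptions all three triples map back to the block numbers
printed alongside them, so the bn values look like plain offsets from
the start of the disk.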

dmesg on boot:
------------------------------------------------------------
pciide0 at pci0 dev 7 function 1: Intel 82371AB IDE controller (PIIX4) (rev. 0x01)
pciide0: bus-master DMA support present
pciide0: primary channel wired to compatibility mode
wd0 at pciide0 channel 0 drive 0: <IBM-DJSA-210>
wd0: drive supports 16-sector PIO transfers, LBA addressing
wd0: 9590 MB, 16383 cyl, 16 head, 63 sec, 512 bytes/sect x 19640880 sectors
wd0: 32-bit data port
wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 4 (Ultra/66)
pciide0: primary channel interrupting at irq 14
wd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 2 (Ultra/33) (using DMA data transfers)
pciide0: secondary channel wired to compatibility mode
pciide0: disabling secondary channel (no drives)
------------------------------------------------------------

disklabel:
------------------------------------------------------------
# /dev/rwd0d:
type: unknown
disk: slab
label: 
flags:
bytes/sector: 512
sectors/track: 63
tracks/cylinder: 16
sectors/cylinder: 1008
cylinders: 16383
total sectors: 19640880
rpm: 3600
interleave: 1
trackskew: 0
cylinderskew: 0
headswitch: 0		# microseconds
track-to-track seek: 0	# microseconds
drivedata: 0 

8 partitions:
#        size    offset     fstype  [fsize bsize cpg/sgs]
 a:   4899825   4401810     4.2BSD   2048 16384   328   # (Cyl. 4366*- 9227*)
 b:    510048  19130832       swap                      # (Cyl. 18979 - 19484)
 c:  15239070   4401810     unused      0     0         # (Cyl. 4366*- 19484)
 d:  19640880         0     unused      0     0         # (Cyl.    0 - 19484)
 e:   9829197   9301635     4.2BSD   2048 16384   328   # (Cyl. 9227*- 18978)
 g:   4096575    305235      MSDOS                      # (Cyl.  302*- 4366*)
------------------------------------------------------------
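
For what it's worth, the label above is enough to work out which
partition the bad sectors fall in, and roughly where.  The sketch below
is just arithmetic on the label.  It assumes the bn values in the error
messages are absolute sector numbers from the start of the disk, and
that a filesystem fragment address is simply the partition-relative
sector divided by the sectors per fragment (fsize / 512).  The names
PARTITIONS and locate are made up for illustration.  It does not get as
far as naming a file.

```python
SECTOR_SIZE = 512

# (name, size, offset, fstype, fsize) copied from the disklabel.
# The "unused" c and d entries are left out because they only describe
# the NetBSD portion of the disk and the whole disk.
PARTITIONS = [
    ("a", 4899825, 4401810, "4.2BSD", 2048),
    ("b", 510048, 19130832, "swap", None),
    ("e", 9829197, 9301635, "4.2BSD", 2048),
    ("g", 4096575, 305235, "MSDOS", None),
]


def locate(bn):
    """Return (partition, fstype, relative sector, fragment) for a disk block."""
    for name, size, offset, fstype, fsize in PARTITIONS:
        if offset <= bn < offset + size:
            rel = bn - offset
            # Fragment address is only meaningful where the label gives an fsize.
            frag = rel // (fsize // SECTOR_SIZE) if fsize else None
            return name, fstype, rel, frag
    return None


for bn in (11499608, 11499610, 11499611):
    print(bn, locate(bn))
```

By this arithmetic all three sectors land in partition e, at relative
sectors 2197973, 2197975 and 2197976, which would be fragments 549493
and 549494 if the fragment assumption holds.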

-- 
"Source code in files. How quaint." - Kent Beck
"Maybe this world is another planet's Hell." - Aldous Huxley