Subject: Re: media errors and packages
To: Dave Schmitt <dschmi1@umbc.edu>
From: Frederick Bruckman <fb@enteract.com>
List: port-mac68k
Date: 05/14/1999 05:57:11

On Thu, 13 May 1999, Dave Schmitt wrote:

> On Thu, 13 May 1999, John Fulmer wrote:
> 
> > It would help a little if you would post the errors. 
> 
> All the errors appear like the one below. There are 81 entries over the week
> (1 on 09-May, 2 on 12-May, the rest today -- 13-May) with around 50 unique
> "info" numbers (are these the bad sectors?). The power outages occurred on
> May 7.
> 
> May  9 03:20:31 clover /netbsd: sd0(sbc0:0:0): medium error, info = 955338
> (decimal), data = 07 16 00 68 11 00 00 00 00 00

Try `scsictl sd0 reassign 955338' (scsictl wants the device name first).
It'd be nice to know if that really works. Of course, if you succeed in
remapping a block that is in use by a file, you will lose the data in that
block of the file. It's worse if the block belongs to a directory, and
much worse if it holds cylinder group metadata.
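
If you want the list of distinct block numbers before trying any
reassignment, a rough Python sketch like the one below could pull them out
of the log. It assumes every entry looks like the one you quoted (device
name, then "medium error, info = N" on one line); the function name and the
regular expression are mine, not anything the kernel or scsictl provides.

```python
import re
from collections import Counter

# Matches the kernel message format quoted in this thread; other drivers
# or kernel versions may word the message differently (assumption).
MEDIUM_ERROR = re.compile(r"(sd\d+)\([^)]*\): medium error, info = (\d+)")


def count_bad_blocks(lines):
    """Return a Counter mapping (device, block number) to log entry count."""
    hits = Counter()
    for line in lines:
        match = MEDIUM_ERROR.search(line)
        if match:
            hits[(match.group(1), int(match.group(2)))] += 1
    return hits


if __name__ == "__main__":
    # The only sample line is the one quoted in this thread.
    sample = [
        "May  9 03:20:31 clover /netbsd: sd0(sbc0:0:0): medium error, info = 955338",
    ]
    for (device, block), count in sorted(count_bad_blocks(sample).items()):
        print(f"{device} block {block}: {count} error(s)")
```

Feeding it the real log (for example, the lines of /var/log/messages) should
show whether the 81 entries really boil down to about 50 distinct blocks.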

> > Here's what I know.
> > 
> > Under BSD (and most of what I know is actually BSDI), SCSI 'medium' errors
> > can be either the disk, the SCSI cable, or the controller.

I use some old disks that were fished out of the trash, too. If the kernel
option "SCSIVERBOSE" is on, you will get a message on the console when a
block is remapped. Some disks can't remap at all, though, or aren't set
to. Since you have FWB to tweak the mode pages, a good strategy is to
set the disk to remap on write errors but not on read errors, if the disk
will take that setting. There are two advantages to that plan. 1) If you
get a medium error while reading old data, you can try again. Once you
remap, the data is lost. 2) After a replacement on write, the next read
gives an error, not just zeroes. That way you know where the bad block is
(was), and you can delete and restore just that one file.
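
For reference, the two settings in question are the AWRE and ARRE bits in
byte 2 of the SCSI Read-Write Error Recovery mode page (page code 01h).
The Python sketch below only computes the flag byte for "remap on write,
not on read"; it does not talk to the disk, and the helper names and the
starting value are made up for illustration. Writing the page back is still
a job for FWB or a similar tool.

```python
# Bit positions in byte 2 of the Read-Write Error Recovery mode page (01h).
AWRE = 0x80  # Automatic Write Reallocation Enabled
ARRE = 0x40  # Automatic Read Reallocation Enabled


def remap_on_write_only(flags):
    """Return the flag byte with AWRE set and ARRE cleared.

    All other bits in the byte are left untouched.
    """
    return (flags | AWRE) & ~ARRE & 0xFF


def describe(flags):
    """Report which automatic reallocation modes the flag byte enables."""
    return {
        "remap_on_write": bool(flags & AWRE),
        "remap_on_read": bool(flags & ARRE),
    }


if __name__ == "__main__":
    current = 0xC0  # hypothetical starting value: AWRE and ARRE both on
    wanted = remap_on_write_only(current)
    print(f"{current:#04x} -> {wanted:#04x}", describe(wanted))
```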

> > If it's the disk, it means that you have run out of remappable sectors,
> > and it can't remap no more. A low level format MIGHT take care of it.

It might mean that the disk isn't set to remap, or isn't capable of
remapping. OTOH, if you run out of spare sectors while low-level
formatting, the format will fail, and that will be the end of it. There
are typically only two spare sectors assigned to each track, and maybe one
more spare track for the whole disk, so you can run out of spares for one
particular track even though you can still remap sectors in other tracks.
The format will then return an error, even though the disk isn't (yet)
completely unusable.
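
To get a feel for the per-track limit, here is a back-of-the-envelope
Python sketch. It assumes a deliberately simplified geometry (a constant
number of sectors per track and a linear block-to-track mapping) and
ignores the extra disk-wide spare track, so treat the result as a hint
only. The function name, the sectors-per-track figure, and the block
numbers in the example are all hypothetical.

```python
from collections import Counter


def tracks_out_of_spares(bad_blocks, sectors_per_track, spares_per_track=2):
    """Return {track: bad sector count} for tracks with more bad sectors
    than per-track spares.

    Simplified model: constant sectors per track, linear block-to-track
    mapping, no disk-wide spare track. Real drives will differ.
    """
    per_track = Counter(block // sectors_per_track for block in bad_blocks)
    return {
        track: count
        for track, count in per_track.items()
        if count > spares_per_track
    }


if __name__ == "__main__":
    # Hypothetical bad blocks on a made-up disk with 32 sectors per track.
    print(tracks_out_of_spares([0, 1, 2, 100], sectors_per_track=32))
```

A track that shows up in the result has more bad sectors than its own
spares can cover, which is the case where a format could fail even though
the rest of the disk still has spares left.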