Subject: Re: SBC probs
To: Chris Mason <cmason@nando.net>
From: Scott Reynolds <scottr@edsi.org>
List: port-mac68k
Date: 09/03/1996 00:57:14

On Thu, 29 Aug 1996, Chris Mason wrote:

> The blocks were not bad when I first formatted the disk for MacBSD (under
> 1.1).  I've run for at least 6 months of light (not constant, say every
> weekend or so) use on 1.1 without a single medium error like this.  I
> installed a new kernel (NFS_22 I believe) and almost immediately began
> seeing this media error.

I know this may sound far-fetched, but I have no explanation other than
simple coincidence.

> That is when I reformatted the disk (and ran
> several extensive tests on it which reported no errors) and installed
> 1.2_beta on it.  It ran okay for about a week or so and then the media
> errors cropped up again.  The sectors I'm seeing now are quite different
> than the ones I first saw.

This last fact notwithstanding, you are looking at the typical pattern one
would expect for a true-blue medium failure.  The electronics on the drive
simply aren't sensitive enough to do the type of testing that is necessary
to certify the drive, which is why it's nearly always a very bad idea to
reformat the disk in a way that discards the current error info.

> Also, the errors that I'm seeing are quite random, wouldn't a head crash tend
> to affect blocks in one specific location on the disk??

Actually, a bad read/write head would show up as many contiguous bad
blocks all over the disk.  With only ~20 bad blocks, that is not the
problem.

> Could
> some part of the netbsd kernel other than the scsi driver have changed
> between 1.1 and 1.2_beta that might cause a problem like this?

No.  The way the SCSI bus works, either the entire block is transferred,
or it isn't; the drive electronics/firmware are 100% responsible for
getting it onto a platter (or not).  The kernel doesn't know how that's
done, and doesn't really care.  The only way I can imagine causing bogus
medium errors is to turn off the disk at the right instant, i.e. while it's
actually rewriting a sector.
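
For reference, the drive reports a medium error back to the host in its
sense data.  Here is a minimal sketch of pulling the sense key and the
failing block number out of fixed-format sense data as laid out in SCSI-2.
The function name and the sample bytes are made up for illustration; this
is not the kernel's code.

```python
MEDIUM_ERROR = 0x3  # SCSI-2 sense key for an unrecoverable medium fault


def decode_fixed_sense(sense: bytes):
    """Return (sense_key, lba_or_None) from fixed-format sense data."""
    if len(sense) < 8:
        raise ValueError("fixed-format sense data is at least 8 bytes")
    # Low 7 bits of byte 0 are the response code: 0x70 current, 0x71 deferred.
    if (sense[0] & 0x7F) not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    valid = bool(sense[0] & 0x80)              # information field is meaningful
    sense_key = sense[2] & 0x0F                # low nibble of byte 2
    info = int.from_bytes(sense[3:7], "big")   # bytes 3..6, big-endian
    return sense_key, (info if valid else None)


# Hypothetical sample: valid bit set, MEDIUM ERROR, information = 0x0001E240.
sample = bytes([0xF0, 0x00, 0x03, 0x00, 0x01, 0xE2, 0x40, 0x0A]) + bytes(10)
key, lba = decode_fixed_sense(sample)
print(key == MEDIUM_ERROR, lba)
```

The point is only that the block number comes from the drive itself; the
host just relays what it is told.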

> Am I safe just running fsck and letting it take care of the bad
> blocks when it finds them or should I be doing something more drastic???

You should remap the bad blocks.  There was a quick hack that may do the
job for you, posted to current-users some time back, called "sdremap" I
believe.
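
I don't know exactly how that tool does it, but the standard SCSI-2
mechanism for remapping is the REASSIGN BLOCKS command (opcode 0x07).  A
rough sketch of building the 6-byte CDB and the defect list it takes
follows.  The function name is hypothetical, and actually sending the
command needs a platform-specific SCSI pass-through interface that is not
shown here.

```python
import struct

REASSIGN_BLOCKS = 0x07  # SCSI-2 opcode


def build_reassign_blocks(lbas):
    """Return (cdb, parameter_list) asking the drive to remap the given LBAs."""
    # 6-byte CDB: opcode, then reserved bytes and a zero control byte.
    cdb = bytes([REASSIGN_BLOCKS, 0, 0, 0, 0, 0])
    # Each defect descriptor is one 4-byte big-endian logical block address.
    defects = b"".join(struct.pack(">I", lba) for lba in lbas)
    # 4-byte header: 2 reserved bytes, then the defect list length in bytes.
    header = struct.pack(">HH", 0, len(defects))
    return cdb, header + defects


# Hypothetical block numbers, e.g. taken from the kernel's error messages.
cdb, data = build_reassign_blocks([123456, 7])
print(cdb.hex(), data.hex())
```

Any data in a reassigned block should be considered lost, so restore the
affected files from backup afterwards.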

> Also, aren't modern SCSI drives supposed to automatically remap bad blocks??

Only if you tell them to, which we do not.  I can't answer the obvious
question ("why not?").
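
For the curious: in SCSI-2 the "tell them to" part lives in the Read-Write
Error Recovery mode page (page code 0x01), where the AWRE and ARRE bits
enable automatic reallocation on write and on read.  Below is a minimal
sketch of inspecting and setting those bits in a page buffer.  The helper
names and the sample page are made up, whether a particular drive honors
the bits is drive-dependent, and issuing the MODE SENSE / MODE SELECT
commands themselves is not shown.

```python
AWRE = 0x80  # automatic write reallocation enabled (byte 2, bit 7)
ARRE = 0x40  # automatic read reallocation enabled  (byte 2, bit 6)


def auto_realloc_flags(page: bytes):
    """Return (awre, arre) from a Read-Write Error Recovery mode page."""
    if len(page) < 3 or (page[0] & 0x3F) != 0x01:
        raise ValueError("not mode page 0x01")
    return bool(page[2] & AWRE), bool(page[2] & ARRE)


def enable_auto_realloc(page: bytes) -> bytes:
    """Return a copy of the page with AWRE and ARRE set, for MODE SELECT."""
    out = bytearray(page)
    out[0] &= 0x3F          # clear the PS bit, which is reserved on MODE SELECT
    out[2] |= AWRE | ARRE
    return bytes(out)


# Hypothetical page as returned by MODE SENSE: code 0x01, length 0x0A,
# with both reallocation bits off.
page = bytes([0x81, 0x0A, 0x00]) + bytes(9)
print(auto_realloc_flags(page), auto_realloc_flags(enable_auto_realloc(page)))
```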

--scott