Re: RAIDframe corruption

To: manu%netbsd.org@localhost (Emmanuel Dreyfus)
Subject: Re: RAIDframe corruption
From: Robert Elz <kre%munnari.OZ.AU@localhost>
Date: Mon, 29 Feb 2016 12:59:05 +0700

    Date:        Mon, 29 Feb 2016 05:59:05 +0100
    From:        manu%netbsd.org@localhost (Emmanuel Dreyfus)
    Message-ID:  <1mjd8z5.cz101sb8f9h0M%manu%netbsd.org@localhost>

  | Anyone has advice on how to cope with this?

Turn off wd0 and replace it.   Or at least, extract it and run some
torture tests to determine if it is really broken or not.
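
As a rough illustration of the kind of write-and-verify pass such a test performs, here is a minimal sketch. The function name `pattern_test`, the block size and the scratch file are all assumptions made for illustration. A real test would target the raw device with a dedicated tool, since reads from an ordinary file may be served from the OS cache rather than from the disc.

```python
import os
import random
import tempfile


def pattern_test(path, block_size=4096, blocks=64, seed=1):
    """Write seeded pseudo-random blocks to path, read them back,
    and return the indices of blocks that do not match."""
    rng = random.Random(seed)
    expected = [
        bytes(rng.getrandbits(8) for _ in range(block_size))
        for _ in range(blocks)
    ]

    # Write every block and ask the OS to push it to stable storage.
    with open(path, "wb") as f:
        for block in expected:
            f.write(block)
        f.flush()
        os.fsync(f.fileno())

    # Read back and compare block by block.
    mismatches = []
    with open(path, "rb") as f:
        for i, block in enumerate(expected):
            if f.read(block_size) != block:
                mismatches.append(i)
    return mismatches


# Hypothetical usage against a scratch file in a temporary directory.
with tempfile.TemporaryDirectory() as d:
    result = pattern_test(os.path.join(d, "scratch.bin"))
```

An empty list means every block read back as written. A non-empty list names the blocks worth a closer look, though it does not say whether the disc, the RAM or the bus is at fault.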

RAIDframe has nothing to do with this, except that it will allow your system
to keep working on just (what was) wd1 until you get a replacement for
wd0. After that it will copy the data from wd1 to the replacement, and you
will be back working again.

Neither RAIDframe nor any other RAID will cope with discs that silently
return bad data.   Its (their) purpose is to handle the case when the
disc reports failure.
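
To make the distinction concrete, here is a toy model of a two-disc mirror. It is only a sketch: the `Disc` class and `mirror_read` function are invented for illustration and are not how RAIDframe is implemented. The point is that the fallback to the second component is triggered by a reported error, never by the content of the data.

```python
class Disc:
    """Toy disc: block number -> bytes, plus a set of blocks that
    report an I/O error when read."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)
        self.bad = set()

    def read(self, n):
        if n in self.bad:
            raise OSError(f"read error on block {n}")
        return self.blocks[n]


def mirror_read(primary, secondary, n):
    """Read block n from the primary; fall back to the secondary
    only when the primary reports an error."""
    try:
        return primary.read(n)
    except OSError:
        return secondary.read(n)


good = {0: b"hello world"}
wd0, wd1 = Disc(good), Disc(good)

# Case 1: wd0 reports the failure, so the mirror recovers from wd1.
wd0.bad.add(0)
reported = mirror_read(wd0, wd1, 0)

# Case 2: wd0 silently returns wrong data; nothing triggers the fallback.
wd0.bad.clear()
wd0.blocks[0] = b"hellO world"
silent = mirror_read(wd0, wd1, 0)
```

In the first case `reported` holds the good data from wd1. In the second case `silent` holds the corrupted block from wd0, and the mirror has no way of knowing.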

It is unlikely that a word in a block changed while being read and the
disc's sector checksum failed to detect it (possible, but very unlikely).
The more likely cause is that something failed while the block was being
written, before the sector checksum was calculated, so it could be the RAM
or the PCI bus on your system.  Of course, if you are 100% certain that the
block could never have been written since it was last known to be correct,
that hypothesis goes out the window.  But you would have to be very certain;
otherwise you might want to suspect something in the system (power, RAM, ...)
other than the drive, which might just be working perfectly.
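
The reasoning can be sketched in a few lines. This is an illustration only: `zlib.crc32` stands in for whatever error-detecting code the drive really uses, and the helper names are made up. Data damaged before the checksum is computed reads back without complaint, while data damaged afterwards is caught.

```python
import zlib


def write_sector(data):
    """Store data with a checksum computed at write time
    (CRC32 here is an assumed stand-in for the drive's own code)."""
    return {"data": data, "crc": zlib.crc32(data)}


def read_sector(sector):
    """Return the data, or raise if the stored checksum does not match."""
    if zlib.crc32(sector["data"]) != sector["crc"]:
        raise OSError("sector checksum mismatch")
    return sector["data"]


def flip_bit(data, byte_index, bit):
    """Return a copy of data with one bit inverted."""
    buf = bytearray(data)
    buf[byte_index] ^= 1 << bit
    return bytes(buf)


original = b"A" * 512

# Damage in RAM or on the bus, before the drive computes its checksum:
# the checksum covers the bad data, so the read succeeds.
in_flight = flip_bit(original, 10, 0)
read_back = read_sector(write_sector(in_flight))

# Damage on the medium, after the checksum was computed: the read fails.
sector = write_sector(original)
sector["data"] = flip_bit(original, 10, 0)
try:
    read_sector(sector)
    detected = False
except OSError:
    detected = True
```

Here `read_back` differs from `original` yet no error was raised, which is the silent case, while `detected` ends up true for the damage that happened after the write.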

kre
