NetBSD-Users archive


Re: RAIDframe corruption



    Date:        Mon, 29 Feb 2016 05:59:05 +0100
    From:        manu%netbsd.org@localhost (Emmanuel Dreyfus)
    Message-ID:  <1mjd8z5.cz101sb8f9h0M%manu%netbsd.org@localhost>

  | Anyone has advice on how to cope with this?

Turn off wd0 and replace it.   Or at least, extract it and run some
torture tests to determine if it is really broken or not.
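
One possible shape for such a torture test is a write-then-verify pass:
fill the disc with a reproducible pattern, read it back, and note which
blocks differ. The sketch below is only an illustration and not a tool
that ships with NetBSD; the names `pattern`, `write_pass`, `verify_pass`
and the 64 KiB block size are all assumptions made for the example. It
destroys whatever is stored at `path`, so it should only ever be pointed
at a disc that has already been pulled from the array.

```python
import hashlib
import os

BLOCK = 64 * 1024  # bytes per test block; an arbitrary choice


def pattern(index, seed=b"torture"):
    """Return a deterministic pseudo-random block for block number `index`."""
    chunks = []
    for counter in range(BLOCK // 32):  # SHA-256 yields 32 bytes per digest
        h = hashlib.sha256(
            seed + index.to_bytes(8, "big") + counter.to_bytes(4, "big")
        )
        chunks.append(h.digest())
    return b"".join(chunks)


def write_pass(path, nblocks):
    """Overwrite the first `nblocks` blocks of `path` with the pattern."""
    with open(path, "wb") as f:
        for i in range(nblocks):
            f.write(pattern(i))
        f.flush()
        os.fsync(f.fileno())  # push the data out of the OS buffers


def verify_pass(path, nblocks):
    """Read the blocks back; return the indices that do not match."""
    bad = []
    with open(path, "rb") as f:
        for i in range(nblocks):
            if f.read(BLOCK) != pattern(i):
                bad.append(i)
    return bad
```

A read-back that happens straight after the write may be served from a
cache rather than from the platters, so a meaningful run would repeat
the verify pass later (for instance after a power cycle) and over
several iterations.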

RAIDframe has nothing to do with this, except that it will allow your
system to keep working with just (what was) wd1 until you get a
replacement for wd0. After that it will copy the data from wd1 to the
replacement, and you will be back working again.

Neither RAIDframe nor any other RAID will cope with discs that silently
return bad data.   Their purpose is to handle the case where the disc
reports a failure.
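
To make the limitation concrete: when both halves of a mirror return
data without reporting an error, all that can be done is to compare
them, and a mismatch says only that the copies differ, not which one is
right. The sketch below is a hypothetical helper (the name
`differing_blocks` and its parameters are assumptions, not part of
RAIDframe or raidctl) that lists the block numbers at which two
components disagree. The `skip` parameter is there because on-disc
metadata at the start of a component may legitimately differ; the right
value for it is not something this sketch knows.

```python
def differing_blocks(path_a, path_b, block_size=65536, skip=0):
    """Return the indices of blocks whose contents differ between two files.

    Reading starts `skip` bytes into each file, and block 0 is the first
    block after that offset.
    """
    bad = []
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        a.seek(skip)
        b.seek(skip)
        index = 0
        while True:
            data_a = a.read(block_size)
            data_b = b.read(block_size)
            if not data_a and not data_b:
                break  # both inputs exhausted
            if data_a != data_b:
                bad.append(index)
            index += 1
    return bad
```

An empty result means the two copies agree over the range read; it does
not prove that the shared contents are what was originally written.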

It is unlikely that a word in a block changed while being read and that
the disc's sector checksum failed to detect it (possible, but very
unlikely).  The more likely cause is that something failed while the
block was being written, before the sector checksum was calculated, so
it could be the RAM or the PCI bus on your system.  Of course, if you
are 100% certain that the block was never written after it was last
known to be correct, that hypothesis goes out the window, but you would
have to be very certain.  Otherwise you might want to suspect something
in the system (power, RAM, ...) other than the drive, which might be
working perfectly.

kre


