Re: RaidFrame Raid-1 problem (can't ditch a failing disk)

On 2/26/10 8:39 AM, Greg Oster wrote:
On Fri, 26 Feb 2010 01:12:12 -0500
Louis Guillaume wrote:


I have a strange problem replacing a drive from a RAID-1 RaidFrame set.
Here's some info:

# uname -mrs
NetBSD 5.0_STABLE i386

# raidctl -s raid0
             /dev/sd0a: failed
             /dev/sd1a: optimal
No spares.
/dev/sd0a status is: failed.  Skipping label.
Component label for /dev/sd1a:
     Row: 0, Column: 1, Num Rows: 1, Num Columns: 2
     Version: 2, Serial Number: 20071216, Mod Counter: 280
     Clean: No, Status: 0
     sectPerSU: 128, SUsPerPU: 1, SUsPerRU: 1
     Queue size: 100, blocksize: 512, numBlocks: 143638784
     RAID Level: 1
     Autoconfig: Yes
     Root partition: Yes
     Last configured as: raid0
Parity status: DIRTY
Reconstruction is 100% complete.
Parity Re-write is 100% complete.
Copyback is 100% complete.
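
For anyone scripting a check around this, the component lines in the status output above can be picked out mechanically. A minimal Python sketch, assuming the output looks the way it does in this message (other raidctl versions may format it differently); SAMPLE is an abbreviated copy of the output above, and the names component_states and failed_components are made up for illustration:

```python
import re

# Abbreviated copy of the "raidctl -s raid0" output pasted above.
SAMPLE = """\
             /dev/sd0a: failed
             /dev/sd1a: optimal
No spares.
/dev/sd0a status is: failed.  Skipping label.
Component label for /dev/sd1a:
     Row: 0, Column: 1, Num Rows: 1, Num Columns: 2
Parity status: DIRTY
"""

# Matches component lines such as "   /dev/sd0a: failed".
# Lines like "/dev/sd0a status is: ..." do not match, because the
# device name must be followed immediately by a colon.
COMPONENT_RE = re.compile(r"^\s*(/dev/\S+):\s+(\w+)\s*$")


def component_states(text):
    """Return a dict mapping each component device to its reported state."""
    states = {}
    for line in text.splitlines():
        match = COMPONENT_RE.match(line)
        if match:
            states[match.group(1)] = match.group(2)
    return states


def failed_components(text):
    """Return the devices whose reported state is exactly 'failed'."""
    return [dev for dev, state in component_states(text).items()
            if state == "failed"]


if __name__ == "__main__":
    for dev in failed_components(SAMPLE):
        print(dev)
```

On a live system the text would come from running raidctl -s raid0 rather than from the pasted sample.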

# dmesg | grep sd0
sd0 at scsibus0 target 0 lun 0:<ModusLnk, ,>  disk fixed
sd0: 70136 MB, 78753 cyl, 2 head, 911 sec, 512 bytes/sect x 143638992
sd0: sync (12.50ns offset 62), 16-bit (160.000MB/s) transfers, tagged
raid0: Components: /dev/sd0a[**FAILED**] /dev/sd1a

# grep smartd.*sd0d /var/log/messages |tail -3
Feb 26 00:43:04 thoth smartd[296]: Device: /dev/sd0d, opened
Feb 26 00:43:04 thoth smartd[296]: Device: /dev/sd0d, is SMART capable. Adding to "monitor" list.
Feb 26 00:43:04 thoth smartd[296]: Device: /dev/sd0d, SMART Failure:

So we got a bad disk and I have to change it out. So I did the following:

Any help would be great!

Since what you're doing seems to be correct, I think we're going to need
a photo or backtrace or whatever of the panic in order to figure out
what's gone wrong :(


Greg Oster

Ok - I was afraid of that. The problem is 100% reproducible, though, so it's easy to do. Here are the screenshots:

In this case, I had removed the failing drive, so we have sd0 on scsibus1. This drive normally shows up as sd1 on scsibus1, but IIRC that doesn't matter to RaidFrame, right? At any rate, the same thing happens with a new blank (identical) disk in scsibus0.

Thanks for looking...


