Subject: Re: RAIDFrame: Reconfiguring an array
To: Mark Cullen <mark.r.cullen@gmail.com>
From: Greg Oster <oster@cs.usask.ca>
List: netbsd-users
Date: 06/10/2006 12:21:49
Mark Cullen writes:
> Just a few questions to clarify really.
> 
> I have a simple RAID-1, configuration as follows:
> 
> ---
> START array
> 1 2 0
> 
> START disks
> /dev/wd1a
> /dev/wd3a
> 
> START layout
> 128 1 1 1
> 
> START queue
> fifo 100
> ---
> 
> It all works great, I can fail either disk and rebuild it using -R 
> /dev/wdXa. I am, however, a little confused over how I might go about 
> removing a disk from the array, should I ever want to (I have no idea if 
> I ever will, but I like to try to be semi-prepared at least).

If the array won't be used again, then just make sure you do a 
'raidctl -A no raid0' so that the components don't get auto-configured.
You can then do a 'raidctl -u raid0' to unconfigure.  If you want to 
nuke the labels (something that raidctl doesn't have an option for)
you can use something like:  

 dd if=/dev/zero of=/dev/wd1a bs=512 count=1 seek=32
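
As a rough sketch, the whole teardown could be wrapped in one script.
The `run` helper, the DRY_RUN variable and the function names are made
up for illustration and are not part of raidctl. With DRY_RUN left at
1 the script only prints the commands. The dd arguments put the write
at byte 512 * 32 = 16384, which is assumed here to be where the
component label lives.

```shell
#!/bin/sh
# Sketch only: retire a RAIDframe set and zero each component label.
# DRY_RUN and run() are illustrative helpers, not part of raidctl.
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Byte offset that bs=512 seek=32 corresponds to.
label_offset() {
    echo $((512 * 32))
}

# usage: retire_array raid0 /dev/wd1a /dev/wd3a
retire_array() {
    raid=$1; shift
    run raidctl -A no "$raid"      # stop auto-configuration
    run raidctl -u "$raid"         # unconfigure the set
    for comp in "$@"; do
        # overwrite one 512-byte block at the assumed label offset
        run dd if=/dev/zero of="$comp" bs=512 count=1 seek=32
    done
}
```

Run it once in dry-run mode and read the printed dd lines before
setting DRY_RUN=0; a wrong "of=" target is not recoverable.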

> I've figured out that I can actually unconfigure the array, by doing a 
> `raidctl -u raid0`, and then I can force a reconfiguration by then doing 
> a `raidctl -C /root/raid.conf raid0`. I'm not sure if I have to do a 
> `raidctl -I <serial> raid0` and `raidctl -i raid0` after this, but it 
> doesn't seem to hurt anything, so I do these too. Is this correct?

If you reconfigure with -C, then you *must* do both -I and -i.
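
Spelled out, the forced sequence would look something like the sketch
below. The serial number 2006061001 is an arbitrary placeholder, and
the `run` helper and DRY_RUN variable are illustrative only. As
written it just prints the three commands.

```shell
#!/bin/sh
# Sketch only: force-configure a set, then stamp and initialize it.
# DRY_RUN and run() are illustrative helpers, not part of raidctl.
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# usage: force_configure /root/raid.conf raid0 <serial>
force_configure() {
    conf=$1
    raid=$2
    serial=$3
    run raidctl -C "$conf" "$raid"     # forced configuration
    run raidctl -I "$serial" "$raid"   # write component labels
    run raidctl -i "$raid"             # (re)write parity / mirror
}

# Placeholder serial; pick your own value.
force_configure /root/raid.conf raid0 2006061001
```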

> Right, so now I unconfigure the array, and change my configuration to 
> remove one of the disks:
> 
> ---
> START array
> 1 2 0
> 
> START disks
> /dev/wd1a
> absent
> #/dev/wd3a
> 
> START layout
> 128 1 1 1
> 
> START queue
> fifo 100
> ---
> 
> After recreating the configuration, it seems to work ok (I can still 
> mount the device, and my test data is still there). I can't, however, do 
> a `raidctl -i raid0`. Doing such a thing, I get an error on the console:
> 
> "raid0: Error re-writing parity!"
> 
> Does this matter?

There's only one component, so it has nowhere to write the "parity" 
(i.e. "mirror", in this case).
 
> Right, so now I have just one disk on the array. I decide to copy some 
> more test data on to it while the second disk is missing. Works fine. I 
> add the missing disk back and reconfigure:
> 
> ---
> START array
> 1 2 0
> 
> START disks
> /dev/wd1a
> #absent
> /dev/wd3a
> 
> START layout
> 128 1 1 1
> 
> START queue
> fifo 100
> ---
> 
> Surprisingly, even though the data on these two disks now differs, the 
> status of the array is ok on both disks. I assume this is because I 
> forced a reconfiguration, and there's really no easy way to tell that 
> the data differs.

Yes...  "-C" is to be used for new arrays with nothing on them, or
when you want to say "I know better, trust me"...  

> A `raidctl -i raid0` succeeds (admittedly I have no 
> idea what it does, however),

For a RAID 1 configuration, it will verify that the data bits 
are in sync... 

> but the data on the two disks is still 
> different, that is, /dev/wd3a still has the old copy of the data. 

Um... after the 'raidctl -i raid0' completes, the data parts of wd1a 
and wd3a had better be *exactly* the same!

> In 
> such a situation then, am I to force the re-added disk to "failed" and 
> rebuild it?

Yes.  With raid0 in operation with just wd1a, you could have done:

 raidctl -a /dev/wd3a raid0
 raidctl -vF absent raid0

and watched it rebuild :)
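
The same two steps, plus a status check afterwards, as a small sketch.
The `run` helper, DRY_RUN and the function name are illustrative and
not part of raidctl. It prints the commands unless DRY_RUN is set to
0.

```shell
#!/bin/sh
# Sketch only: add a spare and reconstruct the missing component onto it.
# DRY_RUN and run() are illustrative helpers, not part of raidctl.
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# usage: rebuild_onto_spare raid0 /dev/wd3a absent
rebuild_onto_spare() {
    raid=$1
    spare=$2
    failed=$3
    run raidctl -a "$spare" "$raid"     # add the disk as a hot spare
    run raidctl -vF "$failed" "$raid"   # fail the component, rebuild to spare
    run raidctl -s "$raid"              # show component status afterwards
}

rebuild_onto_spare raid0 /dev/wd3a absent
```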

> Other than that, I seem to be able to get along with it just fine :-)

Good! :)

Later...

Greg Oster