NetBSD-Users archive

Re: wedges and raidframe recovery



On Mon, 14 Jun 2010 18:34:01 -0400
Steven Bellovin <smb%cs.columbia.edu@localhost> wrote:

> I'm setting up three 2TB disks in a RAID-5 array, on amd64/5.1_rc2.
> During testing, I ran into what I suspect is a serious problem.
> 
> Because of the size of the disks, as best I can tell I have to use
> wedges instead of disklabel.  After poking around and fighting it a
> bit, I managed to set up dk0, dk1, and dk2 as the three wedges, each
> of which is all of one disk.  I then successfully configured the
> raid1 device across those three dk devices, and configured a fourth
> wedge, dk3, to describe all of raid1.  I was then able to build a
> file system:
> 
> # raidctl -s raid1
> Components:
>             /dev/dk0: optimal
>             /dev/dk1: optimal
>             /dev/dk2: optimal
> No spares.
> Component label for /dev/dk0:
>    Row: 0, Column: 0, Num Rows: 1, Num Columns: 3
>    Version: 2, Serial Number: 2010061200, Mod Counter: 187
>    Clean: No, Status: 0
>    sectPerSU: 64, SUsPerPU: 1, SUsPerRU: 1
>    Queue size: 100, blocksize: 512, numBlocks: 3907028992
>    RAID Level: 5
>    Autoconfig: Yes
>    Root partition: No
>    Last configured as: raid1
> Component label for /dev/dk1:
>    Row: 0, Column: 1, Num Rows: 1, Num Columns: 3
>    Version: 2, Serial Number: 2010061200, Mod Counter: 187
>    Clean: No, Status: 0
>    sectPerSU: 64, SUsPerPU: 1, SUsPerRU: 1
>    Queue size: 100, blocksize: 512, numBlocks: 3907028992
>    RAID Level: 5
>    Autoconfig: Yes
>    Root partition: No
>    Last configured as: raid1
> Component label for /dev/dk2:
>    Row: 0, Column: 2, Num Rows: 1, Num Columns: 3
>    Version: 2, Serial Number: 2010061200, Mod Counter: 187
>    Clean: No, Status: 0
>    sectPerSU: 64, SUsPerPU: 1, SUsPerRU: 1
>    Queue size: 100, blocksize: 512, numBlocks: 3907028992
>    RAID Level: 5
>    Autoconfig: Yes
>    Root partition: No
>    Last configured as: raid1
> Parity status: clean
> Reconstruction is 100% complete.
> Parity Re-write is 100% complete.
> Copyback is 100% complete.
> # df
> Filesystem 1024-blocks       Used      Avail %Cap Mounted on
> /dev/raid0a   302652118   29210740  258308774  10% /
> kernfs                1          1          0 100% /kern
> ptyfs                 1          1          0 100% /dev/pts
> procfs                4          4          0 100% /proc
> /dev/dk3     3787858122          2 3598465214   0% /shared
> 
> During some testing involving removing cables, dk2 was perceived as
> "failed" by the RAIDframe code.  Ah -- a perfect opportunity to test
> recovery.  The problem is that I couldn't make it work; no matter
> what I tried, I could not induce the kernel to start recovery on that
> wedge.  dmesg showed complaints about being unable to open the
> device, with 16 -- EBUSY -- as the error code.

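One quick check that might help narrow down the EBUSY: see whether the dk2
wedge can still be opened at all outside of RAIDframe, e.g. (wd2 here is
just a stand-in for whatever the parent disk of dk2 actually is):

 dkctl wd2 listwedges                          # is the dk2 wedge still there?
 dd if=/dev/rdk2 of=/dev/null bs=512 count=1   # can it be opened and read?

If those also fail, the problem is below RAIDframe -- the wedge layer, or a
disk that didn't come back cleanly after the cable was reconnected -- rather
than RAIDframe itself.
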
What was the raidctl command you used?  I'd have expected

 raidctl -R /dev/dk2 raid1

to have worked. (i.e. to rebuild that 'disk' in place...)
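
Spelled out a bit more, the in-place rebuild sequence I'd expect to work
(assuming the dk2 wedge itself is still present and openable) is roughly:

 raidctl -s raid1            # confirm /dev/dk2 really shows as failed
 raidctl -R /dev/dk2 raid1   # fail and reconstruct in place onto dk2
 raidctl -S raid1            # watch the reconstruction progress
 raidctl -P raid1            # afterwards, re-write parity if it isn't clean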

> First -- should this have worked?  Second -- has anyone ever tried
> this sort of configuration?  Third -- my suspicion is that I was
> getting EBUSY because of the multiple layers of wedges, which left
> the disks appearing to be busy.  But if that's correct, there's no
> way to recover, which is not acceptable.

I don't know why it shouldn't have worked...  I also don't know if
anyone has tried this sort of configuration before...
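
For the record (and for anyone who wants to try this sort of setup), here's
roughly what I assume the configuration looked like.  The component names,
serial number, and layout parameters are taken from the raidctl output
above; the wedge-creation steps are my guess, since that part wasn't shown.
Treat it as a sketch, not a recipe.

A raid1 config file along these lines:

 START array
 # numRow numCol numSpare
 1 3 0

 START disks
 /dev/dk0
 /dev/dk1
 /dev/dk2

 START layout
 # sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level
 64 1 1 5

 START queue
 fifo 100

and then something like:

 raidctl -C /path/to/raid1.conf raid1
 raidctl -I 2010061200 raid1
 raidctl -iv raid1
 raidctl -A yes raid1
 (create the dk3 wedge covering all of raid1, with gpt(8) or dkctl(8))
 newfs /dev/rdk3
 mount /dev/dk3 /shared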

Later...

Greg Oster

