NetBSD-Users archive
wedges and raidframe recovery
I'm setting up three 2TB disks in a RAID-5 array, on amd64/5.1_RC2. During
testing, I ran into what I suspect is a serious problem.
Because of the size of the disks, as best I can tell I have to use wedges
instead of disklabel. After poking around and fighting it a bit, I managed to
set up dk0, dk1, and dk2 as the three wedges, each covering all of one disk.
I then successfully configured the raid1 device across those three dk devices,
and configured a fourth wedge, dk3, describing all of raid1. (A rough sketch
of the command sequence appears after the output below.) I was then able to
build a file system:
# raidctl -s raid1
Components:
/dev/dk0: optimal
/dev/dk1: optimal
/dev/dk2: optimal
No spares.
Component label for /dev/dk0:
Row: 0, Column: 0, Num Rows: 1, Num Columns: 3
Version: 2, Serial Number: 2010061200, Mod Counter: 187
Clean: No, Status: 0
sectPerSU: 64, SUsPerPU: 1, SUsPerRU: 1
Queue size: 100, blocksize: 512, numBlocks: 3907028992
RAID Level: 5
Autoconfig: Yes
Root partition: No
Last configured as: raid1
Component label for /dev/dk1:
Row: 0, Column: 1, Num Rows: 1, Num Columns: 3
Version: 2, Serial Number: 2010061200, Mod Counter: 187
Clean: No, Status: 0
sectPerSU: 64, SUsPerPU: 1, SUsPerRU: 1
Queue size: 100, blocksize: 512, numBlocks: 3907028992
RAID Level: 5
Autoconfig: Yes
Root partition: No
Last configured as: raid1
Component label for /dev/dk2:
Row: 0, Column: 2, Num Rows: 1, Num Columns: 3
Version: 2, Serial Number: 2010061200, Mod Counter: 187
Clean: No, Status: 0
sectPerSU: 64, SUsPerPU: 1, SUsPerRU: 1
Queue size: 100, blocksize: 512, numBlocks: 3907028992
RAID Level: 5
Autoconfig: Yes
Root partition: No
Last configured as: raid1
Parity status: clean
Reconstruction is 100% complete.
Parity Re-write is 100% complete.
Copyback is 100% complete.
# df
Filesystem 1024-blocks Used Avail %Cap Mounted on
/dev/raid0a 302652118 29210740 258308774 10% /
kernfs 1 1 0 100% /kern
ptyfs 1 1 0 100% /dev/pts
procfs 4 4 0 100% /proc
/dev/dk3 3787858122 2 3598465214 0% /shared
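For reference, the rough sequence that gets to the state above would be
something like the following. This is a sketch rather than a transcript: the
underlying disk names (wd0, wd1, wd2), the config file path, and the gpt(8)
type aliases are assumptions, while the RAIDframe parameters are the ones
shown in the raidctl output (3 columns, no spares, 64 sectors per stripe
unit, RAID level 5, fifo queue depth 100, serial number 2010061200).

On each underlying disk:

# gpt create wd0
# gpt add -t raid wd0
("raid" as the partition type alias is a guess; dkctl addwedge is another
way to end up with dk0, dk1, and dk2)

/etc/raid1.conf, matching the component labels above:

START array
# numRow numCol numSpare
1 3 0

START disks
/dev/dk0
/dev/dk1
/dev/dk2

START layout
# sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level
64 1 1 5

START queue
fifo 100

Then configure the set, label and initialize it, and put a wedge and a file
system on top:

# raidctl -C /etc/raid1.conf raid1
# raidctl -I 2010061200 raid1
# raidctl -iv raid1
# raidctl -A yes raid1
# gpt create raid1
# gpt add -t ffs raid1
(dk3 shows up as the wedge covering raid1)
# newfs /dev/rdk3
# mount /dev/dk3 /shared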
During some testing that involved pulling cables, dk2 was marked as "failed" by
the RAIDframe code. Ah -- a perfect opportunity to test recovery. The problem
is that I couldn't make it work; no matter what I tried, I could not induce the
kernel to start reconstruction onto that wedge. dmesg showed complaints about
being unable to open the device, with 16 -- EBUSY -- as the error code.
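For concreteness, the obvious way to kick off the rebuild, which is the sort
of thing that kept failing, would be along these lines (dk2 being the
component marked as failed):

# raidctl -R /dev/dk2 raid1
(fail dk2 and immediately reconstruct back onto it in place)
# raidctl -S raid1
(watch the reconstruction progress)

As far as I can tell, it's the open of /dev/dk2 for the rebuild that comes
back with EBUSY.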
First -- should this have worked? Second -- has anyone ever tried this sort
of configuration? Third -- my suspicion is that I was getting EBUSY because of
the multiple layers of wedges, which left the disks appearing to be busy. But
if that's correct, there's no way to recover, which is not acceptable.
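If the layering theory is right, the wedge state on the failed disk ought to
show it. Something like this -- with wd2 assumed to be the disk underneath
dk2 -- is how I'd expect to check which wedge still claims the device:

# dkctl wd2 listwedges
(list the wedges the kernel still has configured on the raw disk)
# dkctl dk2 getwedgeinfo
(show that wedge's parent, offset, size, and state)

But even if that confirms the diagnosis, it doesn't give me a way to hand the
component back to RAIDframe for reconstruction.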
--Steve Bellovin, http://www.cs.columbia.edu/~smb