Current-Users archive


Raidframe can't seem to handle multiple simultaneous spare disks under raid5



        Hello.  It looks like raid5 raid sets can't handle multiple
simultaneous spares.  I've got a raid5 set, shown below, which has been
experiencing a number of sequential disk failures.  I've found that the
following steps panic the system:

1.  First disk breaks, say, /dev/wd2e, and /dev/wd8e is the spare.
So:
#raidctl -a /dev/wd8e raid1
#raidctl -F /dev/wd2e raid1

2.  Without rebooting, the raidctl -s output now shows the spare as
used_spare.

3.  /dev/wd6e breaks, and I've got a /dev/wd9e, not shown, waiting in the
wings as the next spare.
#raidctl -a /dev/wd9e raid1
BOOM!  Panic and reboot.  

Unfortunately, I wasn't able to capture the panic string as it went down,
but since I was able to reproduce this twice, I think it's a software error
in the RAIDframe system.

        The key here is the fact that there was no reboot between steps 1 and
3.  If I had rebooted after the first reconstruction, I could have added
another spare and reconstructed to it, no problem.  So, my theory is that
the RAIDframe system can't handle multiple simultaneous spare disks, at
least if you're using raid5 and you're adding them with the -a flag, as
opposed to using the config file to add spares.
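
        For anyone who wants to avoid tripping over this while it's being
investigated, here is a rough sketch of a guard that could be run before
hot-adding a spare.  It only parses raidctl -s output in the layout shown
at the end of this message.  The function names list_used_spares and
check_before_hot_add are made up for illustration and are not part of
raidctl, and the idea that an existing used_spare is the trigger is just
my theory above, not a confirmed diagnosis.

```sh
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of raidctl.

# Read `raidctl -s` output on stdin and print the device name of every
# entry in the "Spares:" section whose status is used_spare.
list_used_spares() {
    awk '
        /^Spares:/ { in_spares = 1; next }
        # The first non-indented line ends the Spares section.
        in_spares && /^[^ \t]/ { in_spares = 0 }
        in_spares && $2 == "used_spare" { sub(/:$/, "", $1); print $1 }
    '
}

# Read `raidctl -s` output on stdin; return 1 (with a warning on stderr)
# if the set already has a used spare, 0 otherwise.
check_before_hot_add() {
    used=$(list_used_spares)
    if [ -n "$used" ]; then
        echo "warning: set already has a used spare: $used" >&2
        echo "hot-adding another spare without a reboot may panic" >&2
        return 1
    fi
    return 0
}
```

It could be used along the lines of:
raidctl -s raid1 | check_before_hot_add && raidctl -a /dev/wd9e raid1
so that the -a is skipped whenever a used_spare is already present.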

        This is under NetBSD-4, without Greg's latest changes to support
reconstructing to disks larger than 768GB.
        I'll see if I can set up a test system to reproduce this scenario, but
if someone else could test this, that would be great, as I'm pretty busy at
the moment.
-thanks
-Brian

#raidctl -s raid1
Components:
           /dev/wd1e: optimal
           /dev/wd2e: optimal
           /dev/wd3e: optimal
           /dev/wd4e: optimal
           /dev/wd5e: optimal
           /dev/wd8e: spared
           /dev/wd7e: optimal
Spares:
           /dev/wd6e: used_spare
Component label for /dev/wd1e:
   Row: 0, Column: 0, Num Rows: 1, Num Columns: 7
   Version: 2, Serial Number: 2006042301, Mod Counter: 750
   Clean: No, Status: 0
   sectPerSU: 64, SUsPerPU: 1, SUsPerRU: 1
   Queue size: 100, blocksize: 512, numBlocks: 260945856
   RAID Level: 5
   Autoconfig: Yes
   Root partition: No
   Last configured as: raid1
Component label for /dev/wd2e:
   Row: 0, Column: 1, Num Rows: 1, Num Columns: 7
   Version: 2, Serial Number: 2006042301, Mod Counter: 750
   Clean: No, Status: 0
   sectPerSU: 64, SUsPerPU: 1, SUsPerRU: 1
   Queue size: 100, blocksize: 512, numBlocks: 260945856
   RAID Level: 5
   Autoconfig: Yes
   Root partition: No
   Last configured as: raid1
Component label for /dev/wd3e:
   Row: 0, Column: 2, Num Rows: 1, Num Columns: 7
   Version: 2, Serial Number: 2006042301, Mod Counter: 750
   Clean: No, Status: 0
   sectPerSU: 64, SUsPerPU: 1, SUsPerRU: 1
   Queue size: 100, blocksize: 512, numBlocks: 260945856
   RAID Level: 5
   Autoconfig: Yes
   Root partition: No
   Last configured as: raid1
Component label for /dev/wd4e:
   Row: 0, Column: 3, Num Rows: 1, Num Columns: 7
   Version: 2, Serial Number: 2006042301, Mod Counter: 750
   Clean: No, Status: 0
   sectPerSU: 64, SUsPerPU: 1, SUsPerRU: 1
   Queue size: 100, blocksize: 512, numBlocks: 260945856
   RAID Level: 5
   Autoconfig: Yes
   Root partition: No
   Last configured as: raid1
Component label for /dev/wd5e:
   Row: 0, Column: 4, Num Rows: 1, Num Columns: 7
   Version: 2, Serial Number: 2006042301, Mod Counter: 750
   Clean: No, Status: 0
   sectPerSU: 64, SUsPerPU: 1, SUsPerRU: 1
   Queue size: 100, blocksize: 512, numBlocks: 260945856
   RAID Level: 5
   Autoconfig: Yes
   Root partition: No
   Last configured as: raid1
/dev/wd8e status is: spared.  Skipping label.
Component label for /dev/wd7e:
   Row: 0, Column: 6, Num Rows: 1, Num Columns: 7
   Version: 2, Serial Number: 2006042301, Mod Counter: 750
   Clean: No, Status: 0
   sectPerSU: 64, SUsPerPU: 1, SUsPerRU: 1
   Queue size: 100, blocksize: 512, numBlocks: 260945856
   RAID Level: 5
   Autoconfig: Yes
   Root partition: No
   Last configured as: raid1
Component label for /dev/wd6e:
   Row: 0, Column: 5, Num Rows: 1, Num Columns: 7
   Version: 2, Serial Number: 2006042301, Mod Counter: 750
   Clean: No, Status: 0
   sectPerSU: 64, SUsPerPU: 1, SUsPerRU: 1
   Queue size: 100, blocksize: 512, numBlocks: 260945856
   RAID Level: 5
   Autoconfig: Yes
   Root partition: No
   Last configured as: raid1
Parity status: clean
Reconstruction is 100% complete.
Parity Re-write is 100% complete.
Copyback is 100% complete.

