NetBSD-Users archive


Peculiar raidframe problems



I've got an HP server running NetBSD 6.99.40 (amd64) with two pairs of disks, each pair configured as a RAID 1 mirror. The first pair of disks has three RAID partitions arranged as mirrors on wd0 and wd1.

disklabel (wd0, wd1):

6 partitions:
#          size     offset     fstype [fsize bsize cpg/sgs]
 a:    83886080       2048       RAID                        # (Cyl.      2*-  83222*)
 c:  3907027120       2048     unused      0     0           # (Cyl.      2*- 3876020)
 d:  3907029168          0     unused      0     0           # (Cyl.      0 - 3876020)
 e:    83886080   83888128       RAID                        # (Cyl.  83222*- 166442*)
 f:  3739254960  167774208       RAID                        # (Cyl. 166442*- 3876020)

I noticed that raid2 (/dev/wd0f, /dev/wd1f) had a problem: there were disk errors, and /dev/wd0f was marked as failed. I decided to fail /dev/wd0e and /dev/wd0a as well, replace the disk with an identical one disklabeled as before, rebuild the three raid sets, and reboot.
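For the record, the replacement procedure was roughly as follows (a sketch only; the raid-set-to-component mapping is taken from the rest of this post, and /tmp/label is just an illustrative scratch file):

```shell
# Fail the remaining good components on the dying disk
# (/dev/wd0f was already marked failed by the kernel).
raidctl -f /dev/wd0a raid0
raidctl -f /dev/wd0e raid1

# After swapping in the identical replacement disk, copy the
# label from the surviving mirror and write it to the new disk.
disklabel wd1 > /tmp/label
disklabel -R -r wd0 /tmp/label

# Rebuild each mirror in place onto the new components.
raidctl -R /dev/wd0a raid0
raidctl -R /dev/wd0e raid1
raidctl -R /dev/wd0f raid2
```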

The raid sets raid1 (/dev/wd0e, /dev/wd1e) and raid2 (/dev/wd0f, /dev/wd1f) were fine, but in raid0 /dev/wd1a was now marked as failed... and this wasn't the disk that was replaced.

If I reconstruct /dev/wd1a from /dev/wd0a then everything appears fine, but after a reboot /dev/wd1a is marked as failed again. There are no disk errors reported.

root(rakelane)root$ raidctl -R /dev/wd1a raid0
root(rakelane)root$ raidctl -s raid0
Components:
           /dev/wd0a: optimal
           /dev/wd1a: reconstructing
No spares.
Component label for /dev/wd0a:
   Row: 0, Column: 0, Num Rows: 1, Num Columns: 2
   Version: 2, Serial Number: 10, Mod Counter: 279
   Clean: No, Status: 0
   sectPerSU: 64, SUsPerPU: 1, SUsPerRU: 1
   Queue size: 100, blocksize: 512, numBlocks: 83886016
   RAID Level: 1
   Autoconfig: Yes
   Root partition: Force
   Last configured as: raid0
/dev/wd1a status is: reconstructing.  Skipping label.
Parity status: clean
Reconstruction is 4% complete.
Parity Re-write is 100% complete.
Copyback is 100% complete.
root(rakelane)root$ raidctl -s raid0
Components:
           /dev/wd0a: optimal
           /dev/wd1a: optimal
No spares.
Component label for /dev/wd0a:
   Row: 0, Column: 0, Num Rows: 1, Num Columns: 2
   Version: 2, Serial Number: 10, Mod Counter: 280
   Clean: No, Status: 0
   sectPerSU: 64, SUsPerPU: 1, SUsPerRU: 1
   Queue size: 100, blocksize: 512, numBlocks: 83886016
   RAID Level: 1
   Autoconfig: Yes
   Root partition: Force
   Last configured as: raid0
Component label for /dev/wd1a:
   Row: 0, Column: 1, Num Rows: 1, Num Columns: 2
   Version: 2, Serial Number: 10, Mod Counter: 280
   Clean: No, Status: 0
   sectPerSU: 64, SUsPerPU: 1, SUsPerRU: 1
   Queue size: 100, blocksize: 512, numBlocks: 83886016
   RAID Level: 1
   Autoconfig: Yes
   Root partition: Force
   Last configured as: raid0
Parity status: clean
Reconstruction is 100% complete.
Parity Re-write is 100% complete.
Copyback is 100% complete.

After a reboot:

root(rakelane)root$ raidctl -s raid0
Components:
           /dev/wd0a: optimal
           /dev/wd1a: failed
No spares.
Component label for /dev/wd0a:
   Row: 0, Column: 0, Num Rows: 1, Num Columns: 2
   Version: 2, Serial Number: 10, Mod Counter: 289
   Clean: No, Status: 0
   sectPerSU: 64, SUsPerPU: 1, SUsPerRU: 1
   Queue size: 100, blocksize: 512, numBlocks: 83886016
   RAID Level: 1
   Autoconfig: Yes
   Root partition: Force
   Last configured as: raid0
/dev/wd1a status is: failed.  Skipping label.
Parity status: clean
Reconstruction is 100% complete.
Parity Re-write is 100% complete.
Copyback is 100% complete.

Anyone got a clue how to fix this problem?

Dave

--
============================================
Phone: 07805784357
Open Source O/S: www.netbsd.org
Caving: http://www.wirralcavinggroup.org.uk
============================================

