Port-i386 archive


Broken 6.0.1 RAIDframe



I've set up a NetBSD/i386 6.0.1 system with its root partition on a RAID-1 RAIDframe volume and somehow managed to get it into a broken state. This is only a test system containing no important data, so it doesn't matter if it can't be fixed, but I'd be interested to know, for future reference, whether it can be repaired.

The RAID-1 array is composed of wd0a and wd1a.

# uname -mrs
NetBSD 6.0.1 i386
#
# raidctl -s raid0
Components:
           /dev/wd0a: optimal
          component1: failed
No spares.
Component label for /dev/wd0a:
   Row: 0, Column: 0, Num Rows: 1, Num Columns: 2
   Version: 2, Serial Number: 2013001, Mod Counter: 135
   Clean: No, Status: 0
   sectPerSU: 128, SUsPerPU: 1, SUsPerRU: 1
   Queue size: 100, blocksize: 512, numBlocks: 39100160
   RAID Level: 1
   Autoconfig: Yes
   Root partition: Yes
   Last configured as: raid0
component1 status is: failed.  Skipping label.
Parity status: clean
Reconstruction is 100% complete.
Parity Re-write is 100% complete.
Copyback is 100% complete.
#
# raidctl -g component1 raid0
Component label for component1:
   Row: 0, Column: 0, Num Rows: 0, Num Columns: 0
   Version: 0, Serial Number: 0, Mod Counter: 0
   Clean: No, Status: 0
   sectPerSU: 0, SUsPerPU: 0, SUsPerRU: 0
   Queue size: 0, blocksize: 0, numBlocks: 0
   RAID Level:
   Autoconfig: No
   Root partition: No
   Last configured as: raid0
#
# disklabel wd0 | tail -10
disklabel: partitions a and b overlap
headswitch: 0           # microseconds
track-to-track seek: 0  # microseconds
drivedata: 0

5 partitions:
#        size    offset     fstype [fsize bsize cpg/sgs]
 a:  39100288      2048       RAID                     # (Cyl.      2*-  38791)
 b:    395095  38707177       swap                     # (Cyl.  38399*-  38791*)
 c:  39100288      2048     unused      0     0        # (Cyl.      2*-  38791)
 d:  40132503         0     unused      0     0        # (Cyl.      0 -  39813*)
#
# disklabel wd1 | tail -10
disklabel: partitions a and b overlap
headswitch: 0           # microseconds
track-to-track seek: 0  # microseconds
drivedata: 0

16 partitions:
#        size    offset     fstype [fsize bsize cpg/sgs]
 a:  39100288      2048       RAID                     # (Cyl.      2*-  38791)
 b:    395095  38707177       swap                     # (Cyl.  38399*-  38791*)
 c:  40130455      2048     unused      0     0        # (Cyl.      2*-  39813*)
 d:  40132503         0     unused      0     0        # (Cyl.      0 -  39813*)
#
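
The "partitions a and b overlap" warning can be checked by hand from the size and offset columns above. This is a minimal sketch using plain shell arithmetic; check_overlap is a hypothetical helper written for illustration, not part of disklabel, and the numbers are copied from the wd0 label.

```shell
#!/bin/sh
# check_overlap OFF1 SIZE1 OFF2 SIZE2  (all values in sectors)
# Hypothetical helper: reports whether two sector ranges intersect.
check_overlap() {
    end1=$(( $1 + $2 ))   # first sector past the first partition
    end2=$(( $3 + $4 ))   # first sector past the second partition
    # Two ranges overlap when each one starts before the other ends.
    if [ "$1" -lt "$end2" ] && [ "$3" -lt "$end1" ]; then
        echo "overlap"
    else
        echo "no overlap"
    fi
}

# Partition a (offset 2048, size 39100288) against b (offset 38707177, size 395095)
check_overlap 2048 39100288 38707177 395095
```

With these values partition a ends at sector 39102336 and b ends at 39102272, so b lies entirely inside a, which is what the warning reports.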

I don't believe wd1 is faulty so I tried to bring it back into the array using raidctl's -R switch:

# raidctl -R component1 raid0
# tail -1 /var/log/messages
Apr 25 03:51:42 bs5t /netbsd: raid0: rebuilding: dk_lookup on device: component1 failed: 2!
#

Is it possible to remove wd1 from the array somehow, add it as a hot spare, then use -F to reconstruct onto it?
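
For reference, the sequence being asked about could look roughly like the sketch below. It only prints the commands rather than running them; rebuild_plan is a hypothetical wrapper, and the device names (raid0, /dev/wd1a, component1) are taken from the output above. raidctl's -a (add hot spare), -F (fail a component and reconstruct onto a spare) and -S (show progress) are documented switches, but whether this sequence recovers this particular array is an untested assumption.

```shell
#!/bin/sh
# Dry-run sketch: prints the raidctl commands instead of executing them.
# rebuild_plan RAID SPARE FAILED_COMPONENT
rebuild_plan() {
    raid=$1; spare=$2; failed=$3
    echo "raidctl -a $spare $raid"    # add the disk as a hot spare
    echo "raidctl -F $failed $raid"   # fail the component, reconstruct onto the spare
    echo "raidctl -S $raid"           # check reconstruction progress
}

rebuild_plan raid0 /dev/wd1a component1
```

Printing the plan first makes it easy to review the device names before running anything against a real array.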

After failing to improve the situation, I tried starting the machine with just wd0 attached and then with just wd1 attached. Does that leave the RAID array in an inconsistent state when both disks are connected again? Is a record kept of which disk was used most recently, so that its contents are treated as correct and overwrite its partner's when a reconstruction occurs?
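
The component labels shown earlier include a "Mod Counter" field (135 for /dev/wd0a, 0 for component1), which looks like a candidate for such a record. The sketch below just extracts that field from label text so the two components can be compared; mod_counter is a hypothetical helper, and treating the higher counter as "most recent" is an assumption, not something the output above confirms.

```shell
#!/bin/sh
# mod_counter: read component-label text on stdin, print the Mod Counter value.
# Hypothetical helper; intended use: raidctl -g /dev/wd0a raid0 | mod_counter
mod_counter() {
    awk -F'Mod Counter: ' '/Mod Counter:/ { print $2 + 0; exit }'
}

# Example using the label line for /dev/wd0a quoted above
echo "   Version: 2, Serial Number: 2013001, Mod Counter: 135" | mod_counter
```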

By the way, I noticed the MBR partition 6.0.1's sysinst creates has a 2048-sector offset instead of the 63 sectors I'm used to.

# fdisk wd0
Disk: /dev/rwd0d
NetBSD disklabel disk geometry:
cylinders: 39813, heads: 16, sectors/track: 63 (1008 sectors/cylinder)
total sectors: 40132503

BIOS disk geometry:
cylinders: 1023, heads: 255, sectors/track: 63 (16065 sectors/cylinder)
total sectors: 40131504

Partitions aligned to 16065 sector boundaries, offset 63

Partition table:
0: NetBSD (sysid 169)
    start 2048, size 39100288 (19092 MB, Cyls 0/32/33-2434/1/63), Active
1: <UNUSED>
2: <UNUSED>
3: <UNUSED>
Bootselector disabled.
First active partition: 0
#

I guess that's to accommodate more information, but what's an example of that?
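
One thing that can be computed directly is where each starting offset lands in bytes. The sketch below assumes 512-byte sectors (matching "blocksize: 512" in the component label) and checks alignment to a 4096-byte boundary; the 4096 figure and the helper names are assumptions chosen for illustration. A 2048-sector start works out to exactly 1 MiB, so alignment, rather than extra space for information, may be the motivation, though nothing in the output above confirms that.

```shell
#!/bin/sh
sector_size=512   # assumed; matches "blocksize: 512" in the label above

# offset_bytes SECTORS: byte offset of a partition starting at that sector
offset_bytes() { echo $(( $1 * sector_size )); }

# aligned SECTORS BOUNDARY: prints yes/no for alignment to BOUNDARY bytes
aligned() {
    if [ $(( $1 * sector_size % $2 )) -eq 0 ]; then echo yes; else echo no; fi
}

for start in 63 2048; do
    echo "$start sectors = $(offset_bytes $start) bytes, 4096-aligned: $(aligned $start 4096)"
done
```

With these inputs, the 63-sector start is 32256 bytes and not 4096-aligned, while the 2048-sector start is 1048576 bytes and is.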


Ray

