Current-Users archive


raidframe with R5/RS not reconstructing?



An experimental oversight caused one unit of my raid to get marked
as failed, even though nothing's wrong with the hardware.  It's now
running in degraded mode and all the data appear to be intact.

No problem, I thought.  'raidctl -R /dev/wd8a raid0' should put it back
to its old self.

But it doesn't.  It immediately reports 12% completion and sits there
(almost 9 hours now) with no apparent progress and no ETA computed (00:00).

I suppose this is the acid test for the little-used RAID 5 w/Rotated
Sparing option.  The last time I needed it (in NetBSD-4 days on sparc),
it worked just fine.


(Background:

RAIDframe RAID 5 w/Rotated Sparing across 8 1TB Hitachi SATA disks,
wd0-3a (add-in SATA card) and wd5-8a (onboard SATA ports).  wd4 is the
system disk (onboard PATA interface).

The machine has an EM64T-capable CPU, so it could run the amd64 port.
During initial testing with the amd64 version of the Jibbed LiveCD
and NetBSD 5.1, amd64 proved to be unstable, so I installed the i386 port,
which was stable.  I ultimately had to move to -current to get around
a RAIDframe/GPT bug.

Lately, I had been curious as to whether the amd64 port was usable now.
I built an amd64 release and arranged to netboot it from another machine
(which was always a backup plan in case the system disk in the RAID
system failed).  Recalling from previous experience that the device
minor (and some major?) numbers are different between i386 and amd64,
I remembered to './MAKEDEV all' in the guest /dev directory to overwrite
the old nodes.

What I forgot was that "all" for "wd" devices only makes the first 8
devices ([r]wd0-[r]wd7).  I failed to recreate [r]wd8, so that node
still had the major/minor numbers for i386.  On i386, [r]wd8a-h has the
same first 8 minor numbers as amd64 [r]wd4a-h, so when raidframe tried
to attach wd8a, it saw wd4a (the OS system disk) instead, couldn't find
a RAID partition, etc. and recorded wd8a as failed.
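
To make the collision concrete, here is a minimal sketch of the
arithmetic.  It assumes a disk minor number is simply
unit * MAXPARTITIONS + partition index, with 8 partitions per unit on
i386 and 16 on amd64.  Those values and the helper names (disk_minor,
decode_minor) are illustrative assumptions, not taken from the kernel
sources, and major numbers are ignored entirely.

```python
# Assumed partitions-per-unit for each port (illustrative values only).
MAXPARTITIONS = {"i386": 8, "amd64": 16}


def disk_minor(port, unit, partition):
    """Minor number for e.g. wd8a, assuming unit * MAXPARTITIONS + index."""
    nparts = MAXPARTITIONS[port]
    index = ord(partition) - ord("a")
    if not 0 <= index < nparts:
        raise ValueError(f"partition {partition!r} out of range for {port}")
    return unit * nparts + index


def decode_minor(port, minor):
    """Inverse mapping: which (unit, partition) a minor refers to on a port."""
    unit, index = divmod(minor, MAXPARTITIONS[port])
    return unit, chr(ord("a") + index)


# A stale i386 node for wd8a, interpreted by an amd64 kernel:
stale = disk_minor("i386", 8, "a")
print(stale)                         # 64
print(decode_minor("amd64", stale))  # (4, 'a')
```

Under these assumptions, every one of the i386 wd8a-h minors decodes
to the matching wd4 partition on amd64, which is consistent with
raidframe finding the system disk where it expected a RAID component.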

Rebooting the standard i386 system from the machine's own local disk
still reports wd8a as failed, and reconstruction stalls in the way
described above.

Of course, I should've prevented the RAID from configuring in the test
setup.  20/20 hindsight and all that.)

So, um, help?

--
|/"\ John D. Baker, KN5UKS               NetBSD     Darwin/MacOS X
|\ / jdbaker[snail]mylinuxisp[flyspeck]com    OpenBSD            FreeBSD
| X  No HTML/proprietary data in email.   BSD just sits there and works!
|/ \ GPGkeyID:  D703 4A7E 479F 63F8 D3F4  BD99 9572 8F23 E4AD 1645

