netbsd-users: Re: RAIDFrame booting from RAID-1 & kernel dumps

Subject: Re: RAIDFrame booting from RAID-1 & kernel dumps
To: Mark Cullen <mark.r.cullen@gmail.com>
From: Greg Oster <oster@cs.usask.ca>
List: netbsd-users
Date: 06/30/2006 12:12:21
Mark Cullen writes:
> Mark Cullen wrote:
> > 
> > My question is probably reasonably obvious by now: is there any way that 
> > I can force the secondary master drive to be 'wd0' in the event that the 
> > primary master drive disappears (preferably without spending money on an 
> > IDE controller card, or a RAID card even)? I know I could re-order the 
> > drives so that the PS drive is disk #2 for the root device, but then 
> > both of the arrays would have both of their disks on the same IDE 
> > channel, and that would be quite a performance hit I imagine...
> > 
> > 
> 
> I may well have just answered my own question. The thought just popped 
> into my head that I may be able to statically define wd2 and wd3 as the 
> disks on the home RAID array, and leave the other two to be detected 
> automagically, for example:
> 
> wd2     at atabus0 drive 1 flags 0x0000
> wd3     at atabus1 drive 1 flags 0x0000
> wd0     at atabus? drive ? flags 0x0000
> wd1     at atabus? drive ? flags 0x0000
> 
> This seems to work with the test box just fine :-) I removed the PM 
> drive and the SM took over as wd0. I put the PM back in and the SM drive 
> switched back to wd1. Anyone see anything majorly wrong with doing this?

Nope.  Before the days of "autoconfig" in RAIDframe, "nailing down" 
where drives lived was required -- you didn't want wd2 to become wd1, 
and suddenly have your RAID set consisting of wd0f, wd1f, and wd2f get 
royally messed up because wd2f was now where wd1f was supposed to 
be...
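
To make the renumbering concrete, here is a toy model in Python of how 
unit names could be handed out for the config lines quoted above.  It is 
only a sketch, not the real autoconf logic: it assumes drives are probed 
in (bus, drive) order and that an entry with explicit locators is 
preferred over a wildcarded one.  The names CONFIG, WILDCARD_ONLY and 
assign_units are made up for the illustration.

```python
# Toy model of unit numbering; NOT the real NetBSD autoconf algorithm.
# Each kernel-config entry is (unit, bus, drive); None stands for "?".
CONFIG = [
    ("wd2", 0, 1),        # wd2 at atabus0 drive 1
    ("wd3", 1, 1),        # wd3 at atabus1 drive 1
    ("wd0", None, None),  # wd0 at atabus? drive ?
    ("wd1", None, None),  # wd1 at atabus? drive ?
]

# Hypothetical contrast: nothing nailed down at all.
WILDCARD_ONLY = [(f"wd{n}", None, None) for n in range(4)]


def assign_units(present, config=CONFIG):
    """Map each detected (bus, drive) position to a unit name.

    Assumption: positions are probed in sorted (bus, drive) order, and
    the entry with the fewest wildcards wins; ties keep config order.
    """
    free = list(config)
    result = {}
    for bus, drive in sorted(present):
        # Entries whose locators are compatible with this position.
        candidates = [e for e in free
                      if e[1] in (None, bus) and e[2] in (None, drive)]
        if not candidates:
            continue
        best = min(candidates,
                   key=lambda e: (e[1] is None) + (e[2] is None))
        free.remove(best)
        result[(bus, drive)] = best[0]
    return result


all_four = [(0, 0), (0, 1), (1, 0), (1, 1)]
without_pm = [(0, 1), (1, 0), (1, 1)]  # primary master pulled

print(assign_units(all_four))
print(assign_units(without_pm))
print(assign_units(without_pm, WILDCARD_ONLY))
```

With the pinned config the secondary master, position (1, 0), moves from 
wd1 to wd0 when the primary master is missing, while wd2 and wd3 stay 
put.  With only wildcarded entries every remaining drive shifts down by 
one, which is the situation nailing things down avoids.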

> 
> Another question though. With the dump device specified as wd0b in 
> fstab, I am still seeing dmesg print this out:
> 
> "root on raid1a dumps on raid1b"

That's just the "default message" that gets printed... (since it 
hasn't been told otherwise at that point, it assumes it'll be dumping 
on raid1b since raid1a is where / is...  it'll tell you that even if 
you don't have a raid1b...)

> Here's the fstab:
> 
> /dev/raid1a / ffs rw,softdep 1 1
> /dev/raid1e /var ffs rw,softdep 1 2
> /dev/raid1f /tmp ffs rw,softdep 1 2
> /dev/raid1g /usr ffs rw,softdep 1 2
> /dev/raid1b none swap sw 0 0
> /dev/wd0b none swap dp 0 0
> kernfs /kern kernfs rw
> procfs /proc procfs rw,noauto
> /dev/raid0a /usr/home.raid ffs rw,softdep 2 2
> 
> 
> Is this normal? What would happen if it did actually panic and try and 
> dump to raid1b? 

It won't/can't.  RAIDframe won't and can't deal with dumps.

> Would it just complain the device doesn't exist and not 
> dump, or destroy all my data on that RAID volume?

The function that gets called to do the dump returns ENXIO 
("device not configured").
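
For what it's worth, the dump device that matters is the one fstab asks 
for, not the one in the boot message.  The Python sketch below picks it 
out of the fstab quoted above.  It assumes the "last swap entry with the 
dp option wins" rule that, as far as I recall, swapctl -A uses; it only 
parses text and never touches swapctl itself.  The function name 
dump_device is made up for the example.

```python
# The fstab from the quoted message.
FSTAB = """\
/dev/raid1a / ffs rw,softdep 1 1
/dev/raid1e /var ffs rw,softdep 1 2
/dev/raid1f /tmp ffs rw,softdep 1 2
/dev/raid1g /usr ffs rw,softdep 1 2
/dev/raid1b none swap sw 0 0
/dev/wd0b none swap dp 0 0
kernfs /kern kernfs rw
procfs /proc procfs rw,noauto
/dev/raid0a /usr/home.raid ffs rw,softdep 2 2
"""


def dump_device(fstab_text):
    """Return the device of the last swap entry carrying the 'dp' option.

    Assumption: the last such entry is the one that takes effect.
    """
    found = None
    for line in fstab_text.splitlines():
        fields = line.split()
        # Skip blank lines, comments and entries without an options field.
        if len(fields) < 4 or fields[0].startswith("#"):
            continue
        device, _mountpoint, fstype, options = fields[:4]
        if fstype == "swap" and "dp" in options.split(","):
            found = device
    return found


print(dump_device(FSTAB))  # /dev/wd0b
```

So with this fstab the dump target is /dev/wd0b, and raid1b is used for 
swap only.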

In your other post you said:
> Unfortunately this appears not to be the case for me. Suppose I remove 
> the PM drive with this setup. The PS drive, one of the disks which is 
> actually part of the home RAID array, is now wd0 and *NOT* the SM disk. 
> I can see this being potentially quite dangerous, especially since I do 
> actually have a 'b' partition defined in the disklabel on one of the drives 
> in the home RAID array.

"don't use 'b' unless you intend to dump/swap to that partition".  
You should have 12 other letters to choose from -- pick one
other than 'a', 'b', 'c', or 'd' ;)  That goes for picking partitions 
on the RAID sets too :)  (yes, one could argue that there shouldn't 
be anything special about 'a' or 'b' or 'c' or 'd', but historically 
there has been, and it's just easier to "go with the flow" on this...)
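
The "12 other letters" figure can be checked with a few lines of Python. 
The sketch assumes a disklabel with 16 partition slots, 'a' through 'p' 
(as on i386; other ports may differ), and the names MAXPARTITIONS and 
SPECIAL are just labels chosen for the example.

```python
import string

MAXPARTITIONS = 16      # assumption: 16 slots per disklabel (e.g. i386)
SPECIAL = set("abcd")   # letters with a historical special meaning

# Partition letters available in the label: 'a' .. 'p'.
letters = string.ascii_lowercase[:MAXPARTITIONS]

# Letters that are safe to pick for ordinary data or RAID components.
usable = [c for c in letters if c not in SPECIAL]

print("".join(usable), len(usable))  # efghijklmnop 12
```

Anything from 'e' to 'p' avoids the letters that the boot and dump code 
have traditionally treated specially.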

Later...

Greg Oster