netbsd-users: Re: RAIDFrame booting from RAID-1 & kernel dumps

Subject: Re: RAIDFrame booting from RAID-1 & kernel dumps
To: None <netbsd-users@netbsd.org>
From: Mark Cullen <mark.r.cullen@gmail.com>
List: netbsd-users
Date: 06/30/2006 19:37:02
Greg Oster wrote:
> Mark Cullen writes:
> 
>>Mark Cullen wrote:
>>wd2     at atabus0 drive 1 flags 0x0000
>>wd3     at atabus1 drive 1 flags 0x0000
>>wd0     at atabus? drive ? flags 0x0000
>>wd1     at atabus? drive ? flags 0x0000
>>
>>This seems to work with the test box just fine :-) I removed the PM 
>>drive and the SM took over as wd0. I put the PM back in and the SM drive 
>>switched back to wd1. Anyone see anything majorly wrong with doing this?
> 
> 
> Nope.  

Excellent!

> Before the days of "autoconfig" in RAIDframe, "nailing down" 
> where drives lived was required -- you didn't want wd2 to become wd1, 
> and suddenly your RAID set consisting of wd0f, wd1f, and wd2f getting 
> royally messed up because now wd2f was where wd1f was supposed to 
> be...
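
To picture what the hardwiring above does, here is a rough Python sketch. It is a toy model, not the real autoconfig code: I am assuming PM/PS/SM/SS sit at atabus0 drive 0/1 and atabus1 drive 0/1, that all four disks are normally present, and that wildcard entries take the lowest free wildcard unit in probe order. The name assign_units is made up for illustration.

```python
# Toy model of wd unit numbering with two hardwired and two wildcard entries.
# Assumption: this only mimics the behaviour observed on the test box.

def assign_units(present, pinned, wildcard_units):
    """Map each present (bus, drive) location to a wd name.

    present        -- locations found at boot, in probe order
    pinned         -- {(bus, drive): unit} for hardwired config lines
    wildcard_units -- unit numbers left for 'atabus? drive ?' lines
    """
    free = list(wildcard_units)
    names = {}
    for loc in present:
        if loc in pinned:
            # Hardwired: the unit number never depends on the other disks.
            names[loc] = f"wd{pinned[loc]}"
        elif free:
            # Wildcard: take the lowest unit number still unused.
            names[loc] = f"wd{free.pop(0)}"
    return names

# Assumed physical positions (hypothetical labels for this sketch).
PM, PS = ("atabus0", 0), ("atabus0", 1)
SM, SS = ("atabus1", 0), ("atabus1", 1)

PINNED = {PS: 2, SS: 3}   # wd2 at atabus0 drive 1, wd3 at atabus1 drive 1
WILDCARD = [0, 1]         # wd0 and wd1 at atabus? drive ?

all_disks = assign_units([PM, PS, SM, SS], PINNED, WILDCARD)
no_pm = assign_units([PS, SM, SS], PINNED, WILDCARD)

print(all_disks[SM], no_pm[SM], no_pm[PS])
```

In this model, pulling the PM disk moves only the SM disk (wd1 to wd0); the two hardwired disks keep their names, which is the point of nailing them down.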
> 
> 
>>Another question though. With the dump device specified as wd0b in 
>>fstab, I am still seeing dmesg print this out:
>>
>>"root on raid1a dumps on raid1b"
> 
> 
> That's just the "default message" that gets printed... (since it 
> hasn't been told otherwise at that point, it assumes it'll be dumping 
> on raid1b since raid1a is where / is...  it'll tell you that even if 
> you don't have a raid1b...)

Ah I see, ok. I assume this means that in the unlikely event I get a 
panic before it's properly set to wd0b, I won't be able to get a kernel 
dump?

> 
> 
>>What would happen if it did actually panic and try and 
>>dump to raid1b? 
> 
> 
> It won't/can't.  RAIDframe won't and can't deal with dumps.

Good to hear.

> 
> 
>>Would it just complain the device doesn't exist and not 
>>dump, or destroy all my data on that RAID volume?
> 
> 
> The function that gets called to do the dump returns an ENXIO - 
> device not configured.

Ahh.

> 
> In your other post you said:
> 
>>Unfortunately this appears not to be the case for me. Suppose I remove 
>>the PM drive with this setup. The PS drive, one of the disks which is 
>>actually part of the home RAID array, is now wd0 and *NOT* the SM disk. 
>>I can see this being potentially quite dangerous, especially since I do 
>>actually have a 'b' partition defined in disk label on one of the drives 
>>in the home RAID array.
> 
> 
> "don't use 'b' unless you intend to dump/swap to that partition".  
> You should have 12 other letters to choose from -- pick one
> other than 'a', 'b', 'c', or 'd' ;)  That goes for picking partitions 
> on the RAID sets too :)  (yes, one could argue that there shouldn't 
> be anything special about 'a' or 'b' or 'c' or 'd', but historically 
> there has been, and it's just easier to "go with the flow" on this...)

That's actually quite a fantastic suggestion, you know. I was just a 
little wary of changing anything in disklabel in case it destroyed the 
other labels, and thus all of my data. It doesn't seem to though.

I'll go change the letter now just to make sure that, in the now 
impossible case where wd2 would become wd0 (it is hardwired now so I 
don't think it can ever happen, but still) *AND* I stumble upon a panic 
while I am replacing the disk, dump will never manage to destroy the 
data on that spare partition. If that makes any sense :-)

> 
> Later...
> 
> Greg Oster
> 
> 
> 


Thanks for answering all of my questions yet again!

Actually, while on the subject of kernel dumps, I did a test dump as 
described by the guide. I tried 'sync' first, and it printed something 
along the lines of "syncing disks", I think, and then just locked up 
solid. The 'reboot 0x104' method worked fine though. Any thoughts on 
this one?