Subject: Re: grub vs. raidframe
To: Jeff Rizzo <riz@NetBSD.org>
From: Mike M. Volokhov <mishka@apk.od.ua>
List: port-xen
Date: 11/11/2005 10:36:42
On Thu, 10 Nov 2005 08:28:24 -0800
Jeff Rizzo <riz@NetBSD.org> wrote:

> Mike M. Volokhov wrote:
> 
> >On 11/02/2005 14:34:18 EET Jeff Rizzo wrote [not subscribed to port-xen
> >before so this is not followup, sorry]:
> >  
> >
> >
> >3) GRUB must be installed on *all* disks of the RAID set.
> >  
> >
> 
> I guess I could have mentioned that;  I suppose I figured that anyone
> doing raid1 booting on netbsd was already doing that with netbsd
> bootblocks, and would figure it for RAID as well.

Well, I mentioned it just for completeness. :-)

> >But before doing this we should determine what sort of disk
> >failures we expect and how we would respond to them. For example,
> >I assume that if one of the mirrored HDDs fails, we will shut down
> >the Dom0, replace the failed disk and boot up again from the live
> >HDD. (See the reason? This means that the disks must be mirrored at
> >all levels.) So my /grub/menu.lst contains the following lines
> >(stripped):
> >
> >title Xen 2.0.7 / NetBSD 3.0 (HDD1)
> >  root (hd0,g)
> >  kernel (hd0,g)/xen.gz dom0_mem=65536 
> >  module (hd0,g)/netbsd root=/dev/hda1 ro console=pc
> >title Xen 2.0.7 / NetBSD 3.0 (HDD2)
> >  root (hd1,g)
> >  kernel (hd1,g)/xen.gz dom0_mem=65536 
> >  module (hd1,g)/netbsd root=/dev/hdb1 ro console=pc
> >
> >title NetBSD/i386 3.0, MP ACPI (HDD1)
> >  root (hd0,g)
> >  kernel --type=netbsd /netbsd-GENERIC.MPACPI
> >title NetBSD/i386 3.0, MP ACPI (HDD2)
> >  root (hd1,g)
> >  kernel --type=netbsd /netbsd-GENERIC.MPACPI
> >
> >title NetBSD chain (HDD1:0)
> >  root        (hd0,0)
> >  chainloader +1
> >title NetBSD chain (HDD2:0)
> >  root        (hd1,0)
> >  chainloader +1
> >  
> >
> 
> I actually don't see the point of including menu entries for hd1;  if
> hd0 goes bad, hd1 will have to become hd0 to boot anyway.  I generally
> install the bootblock on both disks, but only menu items for the first.

Not necessarily; it depends very much on the sort of failure. Moreover,
you should always be able to restore the previous configuration by
simply replacing the failed disk. Let's assume you have the following
setup:

	HDD0 - failed
	HDD1 - good

When *both* disks are present in the system, it would not be possible
to boot from the second one with menu items for the first disk only.
Next, when you replace the failed HDD, the configuration becomes:

	HDD0 - repaired, good
	HDD1 - good

... you should be able to boot from the second disk too.

And only when you simply remove the failed disk from the machine, so
that it becomes:

	HDD0 - former HDD1, good

... do the additional GRUB menu items become useless.
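
To keep the per-disk entries in sync, they could be generated rather
than typed by hand. The following is only a minimal sketch in Python:
the function name and its parameters are made up for illustration, and
the titles, paths and options are simply copied from the menu.lst
quoted above, so adjust them to your own layout.

```python
# Hypothetical helper: build one GRUB (legacy) menu.lst stanza per
# mirrored disk. Titles, kernel paths and options are copied from the
# menu.lst quoted in this thread; nothing here is an official tool.

def xen_stanzas(disks=2, part="g"):
    """Return menu.lst text with one Xen/NetBSD entry per BIOS disk."""
    letters = "abcdefgh"
    if not 1 <= disks <= len(letters):
        raise ValueError("disks must be between 1 and %d" % len(letters))
    out = []
    for n in range(disks):
        dev = "(hd%d,%s)" % (n, part)          # e.g. (hd0,g), (hd1,g)
        rootdev = "/dev/hd%s1" % letters[n]    # hda1, hdb1 as in the original
        out.append("title Xen 2.0.7 / NetBSD 3.0 (HDD%d)" % (n + 1))
        out.append("  root %s" % dev)
        out.append("  kernel %s/xen.gz dom0_mem=65536" % dev)
        out.append("  module %s/netbsd root=%s ro console=pc" % (dev, rootdev))
    return "\n".join(out) + "\n"

if __name__ == "__main__":
    # Print the stanzas for a two-disk mirror.
    print(xen_stanzas(), end="")
```

With the defaults this prints the two Xen stanzas for HDD1 and HDD2;
the same menu.lst must of course end up on every disk of the set.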

> >And I ran the following commands in GRUB:
> >
> >grub> root (hd0,g)
> >grub> setup (hd0)
> >grub> root (hd1,g)
> >grub> setup (hd1)
> >
> >And one last note:
> >
> >4) Should we add these explanations to Ports/xen/howto.* ?
> >  
> >
> 
> Probably.  Be my guest.  :)

OK, will add it...

> I should note that I've experienced one intermittent problem: 
> sometimes, 'swapoff=YES' in /etc/rc.conf causes the machine to hang when
> removing block-type swap devices at shutdown. (Which is necessary if one
> is to avoid raid parity problems at boot when swapping to raidframe). 
> It seems to be related to how much memory dom0 has - it happens nearly
> 100% of the time when dom0 has 64M, maybe 30% when it has 128M.

Hmmm... Very interesting catch, but I cannot reproduce it on my setup.
I've worked around swap-related problems with the following setup:

wd0, wd1: a - RAID, b - swap

raid0: partitioned without swap

/etc/fstab:
	/dev/raid0a	/	ffs	rw	1 1
	/dev/wd0b	none	swap	sw,dp	0 0
	/dev/wd1b	none	swap	sw	0 0
	/dev/raid0e	/usr	ffs	rw	1 2
	... and so on
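
The point of this layout is that no swap entry lives on a raid device.
A quick way to check an fstab for that could look like the sketch
below (Python; the function name is hypothetical and the sample data is
just the fstab shown above, without the elided entries):

```python
# Hypothetical check: list swap entries in an fstab that sit on a
# RAIDframe device (/dev/raidN...). The sample text mirrors the fstab
# from this message.

FSTAB = """\
/dev/raid0a  /     ffs   rw     1 1
/dev/wd0b    none  swap  sw,dp  0 0
/dev/wd1b    none  swap  sw     0 0
/dev/raid0e  /usr  ffs   rw     1 2
"""

def swap_on_raid(fstab_text):
    """Return the swap devices whose name starts with /dev/raid."""
    hits = []
    for line in fstab_text.splitlines():
        fields = line.split()
        # Skip blank lines, comments and malformed entries.
        if len(fields) < 3 or fields[0].startswith("#"):
            continue
        if fields[2] == "swap" and fields[0].startswith("/dev/raid"):
            hits.append(fields[0])
    return hits

if __name__ == "__main__":
    print(swap_on_raid(FSTAB))   # prints [] for the layout above
```

An empty list means swap goes only to the raw wd0b/wd1b partitions, so
there is no block-type swap on raidframe to remove at shutdown.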

Although, when I do a "sync" in kdb, the system hangs... :-/ Any advice
on how to handle it?

--
Mishka.