Subject: Re: RAIDFrame and NetBSD/sparc booting issues
To: Greg Oster <oster@cs.usask.ca>
From: Brian Buhrow <buhrow@lothlorien.nfbcal.org>
List: port-sparc
Date: 08/13/2003 16:27:14
	Hello.  I've followed this entire discussion with much interest and
thought more about the way raidframe works now and the question of how to
make things work more transparently on raid-1 systems with existing boot
roms.  I realized I have a couple of questions and would like to make sure
my understanding of the current layout of raidframe partitions is correct.

1.  It's my understanding that the area protected by RF_PROTECTEDSECTORS is
designed to include such things as the physical disklabel, any bootstrap
code that might reside on the physical disk, and the raid
component label itself.  This would imply that disklabel -r sd0 or wd0
should read the label out of this protected region, assuming the raid
partition includes the entire disk.  Is this right?

2.  It looks to me like most of the boot loaders work in such a way that
the first stage loader has the block numbers of the second stage loader
hard coded into it, meaning that the second stage loader could be loaded
from any portion of the disk, including the first portion of an ffs
filesystem inside a raid-1 partition.  Once the second stage loader is
loaded, I believe space and code constraints are sufficiently relaxed that
the second stage loader could properly locate a kernel inside a raid-1 set
or a physical disk directly -- no?  If this is so, then it strikes me as
easier to teach the second stage boot loader how to locate a kernel either
in an FFS filesystem in a raid-1 set or in an FFS filesystem in a physical
partition.  Of course, once the kernel is loaded, it can already find the
FFS filesystems inside any raid sets, so that problem is solved.
(I should note that it would also be necessary for the installboot program
to know about FFS filesystems inside raid-1 partitions as well, just so it
can plug in the right numbers for the second stage loader, even if that
loader is inside an FFS filesystem in a raid-1 set.  (Presumably, it could
also locate the loader in a raid-5 set, but of course that wouldn't
actually boot unless the kernel happened to fit inside one of the stripes
of the raid, but that's an entirely different problem :).))
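
	To make the block arithmetic in point 2 concrete, here is a rough
sketch (in Python, purely for illustration) of what a raid-aware installboot
might compute.  The function names are hypothetical, and the value 64 for
RF_PROTECTEDSECTORS is an assumption on my part.  It also assumes that a
raid-1 set lays its data out identically on every component, so block N of
the set sits at a fixed offset from the start of each component partition.

```python
# Assumption: size, in sectors, of the region RAIDframe reserves at the
# start of each component. 64 is used here only for illustration.
RF_PROTECTEDSECTORS = 64


def raid1_to_physical(raid_block, component_start,
                      protected=RF_PROTECTEDSECTORS):
    """Map a sector number inside a RAID-1 set to an absolute disk sector.

    raid_block      -- sector number relative to the start of the raid set
    component_start -- absolute sector where the component partition begins
    protected       -- sectors reserved at the front of the component
    """
    if raid_block < 0 or component_start < 0:
        raise ValueError("sector numbers must be non-negative")
    # Assumed RAID-1 layout: data follows the protected region one-to-one.
    return component_start + protected + raid_block


def block_table_for_loader(fs_blocks, fs_start_in_raid, component_start):
    """Build the list of absolute sectors a first stage loader would read.

    fs_blocks        -- sectors of the second stage loader, relative to
                        the start of the FFS filesystem
    fs_start_in_raid -- sector offset of that filesystem within the raid set
    component_start  -- absolute sector where the component partition begins
    """
    return [
        raid1_to_physical(fs_start_in_raid + block, component_start)
        for block in fs_blocks
    ]


# Example: the raid partition covers the whole disk and the filesystem
# starts at the beginning of the raid set.
table = block_table_for_loader([16, 17, 18], fs_start_in_raid=0,
                               component_start=0)
```

With those assumed numbers, the first stage loader would simply be handed
the translated table; it never needs to know that a raid set is involved.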

	If that problem is solved, I fail to see the need for moving a
component label around and thus having to special case the raid-1 instance
inside the raidframe code.  This would, I believe, free up Greg O's time
to look into issues like:

1.  Determining the feasibility of changing the autoconfiguration code to
account for hot standby partitions, and being able to auto-reconstruct into
them in the event of a component failure without user intervention.

2.  Examining why paging to a raid-5 set causes hangs.

	My point here is that I believe this discussion started because there
is some concern that teaching a system to boot from a raid-1 set is not as
straightforward as it should be.  Unless I'm gravely mistaken, and please
tell me if I am, this deficiency can be met more easily by modifying the
boot loader programs for the various architectures, which are orders of
magnitude less complicated, than by modifying the raidframe system itself.
Is there
another piece that I don't understand?
-Brian