Subject: Re: wedges vs. not-quite-wedges, was > 1T filesystems, disklabels, etc
To: Bill Studenmund <wrstuden@netbsd.org>
From: Greg Oster <oster@cs.usask.ca>
List: tech-kern
Date: 12/19/2002 16:39:38
Bill Studenmund writes:
> On Thu, 19 Dec 2002, Greg Oster wrote:
[snip]
> > I've mentioned this before in a few other places, but what I'd like to see
> > (e.g. for use w/ RAIDframe) is something like:
> 
> Please note: I think the below is fine IN TERMS OF HAVING AN LVM ON TOP.
> 
> >  1) The native disklabel looks "however it does" for that arch, and has zero
> > or more partitions "marked" as being "NetBSD".  (e.g. On i386, the MBR might
> > be the native disklabel with a primary or secondary partition(s) marked as
> > being "NetBSD".  Or the current disklabel might be the disklabel.   Whatever.
> > On Suns, the existing sun disklabel would be the native label.)  The only
> > thing "NetBSD-specific" in a native label is some identification that
> > certain chunk(s) of the disk are "NetBSD".
> 
> Two questions. 1) I'd rather expect we'd really mark it, "NetBSD LVM".
> :-) 

"ok" :)

> 2) Why do we need a partition? 

Maybe we don't.. but what we will need is some way to figure out where the 
"NetBSD Metadata" lives.  Maybe we could make assumptions about where such 
metadata might live, but a partition would seem to be the simplest way.

> What do, say, Veritas or IBM's LVMs do?

I have no idea.

> >  2) In each of the chunks marked as being "NetBSD", we put in our own
> > wizz-bang fancy slice/wedge/partition/LVM/RAID-labelling scheme.  The
> > information stored there might be as simple as offset, size, and type,
> > or as complex as a complete RAIDframe component label, or perhaps
> > something even more complicated than that (LVM record of some sort?).
> > The metadata stored here could refer to any/all blocks/partitions on the disk,
> > be they NetBSD-specific or not.
> 
> I think it is VERY DANGEROUS for us to refer to spaces outside of the
> "NetBSD" space. While we do it now in disklabels, as we migrate to more
> indirect allocation methods, I think it will be harder to keep things in
> sync.

mmmm... rope ;)

Are there situations where: 
 1) we need to boot from some native partition and 
 2) that native partition won't/can't actually live in "NetBSD space"?

If there aren't, then we probably don't need to allow the metadata to talk to 
anything outside that space.
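
If we do restrict it that way, the check is cheap.  A minimal sketch in C,
assuming the "NetBSD space" and each metadata entry can both be described as
a start sector plus a sector count (the struct and function names here are
made up for illustration, not existing kernel code):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical extent: a contiguous run of sectors on the disk. */
struct extent {
	uint64_t start;		/* first sector */
	uint64_t size;		/* number of sectors */
};

/*
 * Return true if 'inner' lies entirely within 'outer'.
 * Written as subtractions so that start + size can never overflow.
 */
bool
extent_contains(const struct extent *outer, const struct extent *inner)
{
	if (inner->start < outer->start)
		return false;
	if (inner->size > outer->size)
		return false;
	/* Neither subtraction can underflow after the two checks above. */
	return inner->start - outer->start <= outer->size - inner->size;
}
```

Anything that fails extent_contains() against the NetBSD chunk would simply
be rejected when the metadata is read or written.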
 
> If we want to co-exist with other partitioning schemes, THAT'S COOL. If we
> want an LVM system, THAT'S COOL too. But let's NOT MIX THEM.
> 
> > The advantage is that once you get to the "metadata" specified in 2), it's
> > consistent across all NetBSD platforms.  The other advantage is that the
> > native disklabel now has much less NetBSD-specific stuff in it.  The primary
> > disadvantage is that other OS's might have to learn how to grok the NetBSD
> > metadata.  But a little "NetBSD metadata" library would take care of that...
> 
> Why do we need to get NetBSD-specific stuff out of the "native disklabel"?

Because NetBSD-specific stuff might not fit in native disklabels?  (e.g.  
native disklabels may not have fields large enough to specify "large" 
partitions?  Or disklabels might not be large enough to hold 256 partitions?)  
We also don't have to worry about storing NetBSD bits one way in 
one native label, and a different way in a different native label.  All we 
need to worry about is how to find this "NetBSD chunk" (or chunks..)
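
To put a rough number on the field-size problem, here's a small sketch in C.
It assumes a native label that stores a partition's size as a sector count in
a fixed-width field (the MBR partition table uses 32 bits for this) and
512-byte sectors.  The helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 512-byte sectors. */
#define SECTOR_SIZE 512ULL

/*
 * Largest partition size, in bytes, that a label can describe when the
 * size is stored as a sector count in a field of 'field_bits' bits.
 */
uint64_t
label_max_bytes(unsigned field_bits)
{
	if (field_bits > 55)	/* would not fit in 64 bits; saturate */
		return UINT64_MAX;
	return ((1ULL << field_bits) - 1) * SECTOR_SIZE;
}

/* Can a partition of 'nsectors' sectors be written into such a field? */
bool
label_can_describe(uint64_t nsectors, unsigned field_bits)
{
	if (field_bits >= 64)
		return true;
	return nsectors < (1ULL << field_bits);
}
```

With field_bits = 32 the limit comes out just under 2TB, so a 3TB partition
(6442450944 sectors) can't be expressed in such a label at all.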
 
> I'd think it'd be easier to just teach all flavors of NetBSD about all
> disklabeling schemes we understand. 'cause there will be times when we
> care about non-NetBSD-specific stuff on non-native disks. Making a common
> library to understand other disklabel schemes is the only way to fix that,
> and once we do that, we can also deal with NetBSD-specific stuff. :-)

Except that we may still run into restrictions due to the legacy disklabels.

I guess I'm leaning towards a "let's make the new labelling scheme scalable 
towards LVM-like metadata".  That is, let's break free from the old disklabel 
scheme, and come up with something that deals with not only single partitions 
on obscure disks, but also with the metadata that might be required for RAID 
or LVM components.  We might only use it right now for talking about 
"partitions", but at least the framework would be there for stuffing in
RAIDframe component labels and what-not at a later date.
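
As a very rough illustration of what I mean by "framework", here's a sketch
in C of type-tagged, length-prefixed records.  Everything in it (the names,
the type numbers, the field widths) is invented for the example and is not a
proposed on-disk format; in particular it ignores byte order, which a real
format would have to pin down.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record types; more can be added later. */
enum md_type {
	MD_PARTITION = 1,	/* plain offset/size/type */
	MD_RAID_COMPONENT = 2,	/* e.g. a RAIDframe component label */
	MD_LVM = 3		/* some future LVM record */
};

/* Every record starts with the same header. */
struct md_header {
	uint32_t type;		/* one of enum md_type */
	uint32_t length;	/* record length in bytes, header included */
};

/* The simple case: offset, size, and type. */
struct md_partition {
	struct md_header hdr;
	uint64_t offset;	/* in sectors */
	uint64_t size;		/* in sectors */
	uint32_t fstype;
};

/*
 * Walk a buffer of back-to-back records and count those of type 'want'.
 * Records of unknown type are skipped using their length field; the walk
 * stops at the first truncated or malformed record.
 */
size_t
md_count(const unsigned char *buf, size_t buflen, uint32_t want)
{
	size_t off = 0, n = 0;
	struct md_header h;

	while (buflen - off >= sizeof(h)) {
		memcpy(&h, buf + off, sizeof(h));
		if (h.length < sizeof(h) || h.length > buflen - off)
			break;
		if (h.type == want)
			n++;
		off += h.length;
	}
	return n;
}
```

The point is only that code which knows about "partitions" today can step
over record types it doesn't understand, so RAID or LVM records could be
added later without changing the basic layout.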

But perhaps that's too big of a step for now?

Later...

Greg Oster