tech-kern: Re: dev_t changes & partitions

Subject: Re: dev_t changes & partitions
To: Wolfgang Solfrank <ws@tools.de>
From: Bill Studenmund <wrstuden@vali.stanford.edu>
List: tech-kern
Date: 01/15/1998 13:46:48
On Thu, 15 Jan 1998, Wolfgang Solfrank wrote:

> > Have we even decided that we want "slices"?  I don't want to propagate MBR
> > braindamage to ports that don't have to deal with it..
> 
> Hmm, maybe I'm just stupid (again?), but can anyone please tell me what this
> "slices" would buy us?  I cannot see _any_ advantage this might get me...

The advantage, as I see it, is that it lets you deal with multiple OS's on
an MBR disk. Right now, we look through the MBR for a NetBSD partition,
and then look in there for the disklabel that describes the whole disk.
FreeBSD looks for a FreeBSD partition, and looks in there for a FreeBSD
disklabel. Same with OpenBSD and Linux. One problem is that each OS will
then basically ignore the other OS's partitions, a bad thing if you want
to inter-operate between OS's.

With a slice system, each MBR partition comes to life on its own, with its
own disklabel (reflecting what's in that MBR partition). Thus you can get
at Linux partitions, FreeBSD partitions, or even entries in a DOS extended
partition.

Jason's right that this question's orthogonal to the dev_t one, but I'd
like to put forward a proposal just so folks can stop worrying about it
for a while. Also, because I'd like us to go w/ 6 bits for subunits, which
would leave no obvious bits for slices.

I liked the idea of making slices vnodes on the fly. My idea is, for each
disk drive which will support slices (wd and sd), there is a companion,
sliced major device. The sliced device would support 4 sub-units of the
un-sliced device.

The only real difference between this idea and automatically sticking a
vnode on each MBR partition is that the unit number of the "sliced" device
and the "unsliced" device would match, and that the slice is selected from
upper bits in the sub_unit field. As I understand vnd's, each vnd would
get its own "unit" number. 

When the disklabel for the real ("unsliced") device is read, it would
configure the sliced device appropriately. Appropriately means setting up
slices on an MBR drive or setting up a pass-through layer for a non-MBR
drive (so wd0a could always point to the "sliced" layer and things would
work regardless of the disklabel type).

I really don't like stuffing slices in the same device as the "real" disk.
When you have slicing happening, you're talking about having multiple
layers of disklabels. Without some sort of two-layer scheme, we have to
tuck two or more disklabels into one unit. That strikes me as kludgy,
messy, and bound to be a pain. Also, I agree that we really want to leave
ourselves as many units as we can (think running NetBSD on a multi-headed
huge server. It's a cool thought! :-)

So if the disk has an MBR disklabel, the sliced drive would be broken into
4 chunks (one per possible MBR partition), and each chunk's disklabel is
read if we understand it. Then each chunk gets set up by the relevant
disklabel. If the disk has another type of label, we just set up a 1-to-1
correspondence between "sliced" and "unsliced" partitions.
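
To make that decision concrete, here's a rough sketch in C. It assumes
the usual MBR layout (four 16-byte entries at offset 446 of sector 0, with
the 0x55 0xAA signature at offset 510). struct slice and setup_slices() are
names made up for illustration, not existing kernel interfaces, and reading
the per-slice disklabel is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MBR_SECTOR_SIZE 512
#define MBR_TABLE_OFF   446	/* start of the partition table in sector 0 */
#define MBR_ENTRY_SIZE  16
#define MBR_NSLICES     4	/* one slice per possible MBR partition */
#define MBR_SIG_OFF     510	/* signature bytes 0x55, 0xAA */

/* Hypothetical per-slice record; not an existing kernel structure. */
struct slice {
	uint8_t  type;		/* MBR partition type byte, 0 = unused entry */
	uint32_t start;		/* first sector (LBA) */
	uint32_t size;		/* length in sectors */
};

/* Decode a little-endian 32-bit value from a byte buffer. */
static uint32_t
le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	    ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Returns 1 and fills out[0..3] if sector 0 carries an MBR signature.
 * Returns 0 otherwise, meaning the caller should set up the 1-to-1
 * pass-through between "sliced" and "unsliced" partitions.
 */
static int
setup_slices(const uint8_t *sec0, struct slice out[MBR_NSLICES])
{
	size_t i;

	assert(sec0 != NULL && out != NULL);
	if (sec0[MBR_SIG_OFF] != 0x55 || sec0[MBR_SIG_OFF + 1] != 0xAA)
		return 0;	/* no MBR: pass-through */

	for (i = 0; i < MBR_NSLICES; i++) {
		const uint8_t *e = sec0 + MBR_TABLE_OFF + i * MBR_ENTRY_SIZE;

		out[i].type  = e[4];		/* partition type */
		out[i].start = le32(e + 8);	/* starting LBA */
		out[i].size  = le32(e + 12);	/* sector count */
	}
	return 1;
}
```

Each slice with a non-zero type would then have its own disklabel looked
up at out[i].start, if we understand that label format.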

The advantages of this idea are: 1) it establishes a hierarchy of
disklabeling, representing a real hierarchy already present on disk; 2) we
can do it with 6 bits of sub-unit, which permits both more disk units and
4 slices of 16 partitions each.
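
As a sketch of how the 6-bit sub-unit might be carved up: the upper 2 bits
pick the slice and the lower 4 bits pick the partition within it. The
macro and function names here are invented for illustration, and the
exact bit positions are my assumption beyond "slice in the upper bits".

```c
#include <assert.h>

/* Assumed layout of the 6-bit sub-unit field: SSPPPP */
#define SUBUNIT_BITS	6
#define PART_BITS	4	/* 16 partitions per slice */
#define PART_MASK	((1u << PART_BITS) - 1)
#define SLICE_MASK	((1u << (SUBUNIT_BITS - PART_BITS)) - 1)	/* 4 slices */

/* Slice number (0-3) from the upper bits of the sub-unit. */
static unsigned
subunit_slice(unsigned sub)
{
	return (sub >> PART_BITS) & SLICE_MASK;
}

/* Partition number (0-15) from the lower bits of the sub-unit. */
static unsigned
subunit_part(unsigned sub)
{
	return sub & PART_MASK;
}

/* Combine a slice and a partition into one sub-unit value. */
static unsigned
make_subunit(unsigned slice, unsigned part)
{
	assert(slice <= SLICE_MASK && part <= PART_MASK);
	return (slice << PART_BITS) | part;
}
```

So partition 5 ('f') of slice 2 would be sub-unit (2 << 4) | 5 = 37, and
the highest sub-unit, slice 3 partition 15, is 63.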

As an aside, we could add a partition type to the "NetBSD" disk label
which will indicate that partition should be considered a slice with 16
partitions. The "NetBSD" disklabel can only hold 22 partitions. This way
we could stick extra partitions in the partition 32->63 region. You could
have two of these things, and the two together would fill the region.

So here's a way to have slices using only 6 bits for sub-unit. Let's just
go with a 12/14/6 split for devices, and do this later. I realize this
description's a bit fuzzy, and will explain more if others want.
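
For reference, a minimal sketch of what a 12/14/6 split could look like
in a 32-bit dev_t. I'm assuming the fields are major/unit/sub-unit in that
order, from the high bits down; xdev_t, xmakedev() and friends are
made-up names, not the real macros.

```c
#include <assert.h>
#include <stdint.h>

#define MAJOR_BITS	12
#define UNIT_BITS	14
#define SUB_BITS	6	/* MAJOR_BITS + UNIT_BITS + SUB_BITS == 32 */

typedef uint32_t xdev_t;	/* hypothetical stand-in for dev_t */

/* Pack major, unit and sub-unit into one 32-bit device number. */
static xdev_t
xmakedev(unsigned maj, unsigned unit, unsigned sub)
{
	assert(maj < (1u << MAJOR_BITS));
	assert(unit < (1u << UNIT_BITS));
	assert(sub < (1u << SUB_BITS));
	return ((xdev_t)maj << (UNIT_BITS + SUB_BITS)) |
	    ((xdev_t)unit << SUB_BITS) | (xdev_t)sub;
}

static unsigned
xmajor(xdev_t d)
{
	return d >> (UNIT_BITS + SUB_BITS);
}

static unsigned
xunit(xdev_t d)
{
	return (d >> SUB_BITS) & ((1u << UNIT_BITS) - 1);
}

static unsigned
xsubunit(xdev_t d)
{
	return d & ((1u << SUB_BITS) - 1);
}
```

That gives 4096 majors, 16384 units per major, and 64 sub-units per unit,
which is enough room for the 4 slices of 16 partitions described above.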

Take care,

Bill