Subject: Re: partition bookkeeping
To: der Mouse <mouse@Rodents.Montreal.QC.CA>
From: Bill Studenmund <wrstuden@nas.nasa.gov>
List: tech-kern
Date: 09/22/1999 16:11:03

On Wed, 22 Sep 1999, der Mouse wrote:

> >   => stuff I wrote
> >>  => stuff Frank wrote
> >>> => stuff der Mouse wrote

> >>> Yeah...but I still don't see any need for a "raw partition" with
> >>> wedges.  Just access the underlying device directly rather than
> >>> doing anything with the wedge partition devices.
> > So you're suggesting that the major number for the wedges will not be
> > the same as the major number for the device itself?
> 
> That's certainly what I was expecting; until you said this, it had
> never occurred to me to question it.  Just like vnd and ccd and raid
> devices, wedge devices would have wedge driver majors.
> 
> > That's not something I took out of the initial wedge proposal, and is
> > something I'd object to.
> 
> Why do you object to it?

Because it adds what in my opinion is a wasteful step for little gain.
As I understood it, wedges will totally replace in-core disklabels - to
get at on-disk partitions, you need to use wedges. That's fine as I think
having two live partitioning schemes would be gross. :-)

With wedges using their own major number, every time we do i/o on a
wedge partition (i.e. eventually all disk i/o), we'll have to enter the
wedge driver, do some stuff to figure out which real driver we need, and
then call it. That's an extra subroutine call (and kernel stack growth,
etc) which isn't needed.
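
A minimal sketch of that extra hop, under stated assumptions: struct io_req
is a hypothetical stand-in for struct buf, and wedge_strategy,
disk_strategy, and the wedges[] table are invented for illustration rather
than taken from any NetBSD source. The call_depth counter exists only to
make the second driver entry visible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct io_req {                  /* hypothetical stand-in for struct buf */
    unsigned minor;              /* minor of the device being addressed */
    uint64_t blkno;              /* block number relative to that device */
};

typedef int (*strategy_fn)(struct io_req *);

struct wedge {                   /* hypothetical per-wedge state */
    strategy_fn parent_strategy; /* entry point of the real disk driver */
    unsigned parent_minor;       /* minor of the underlying disk */
    uint64_t offset;             /* wedge start on the parent, in blocks */
};

#define NWEDGES 8
static struct wedge wedges[NWEDGES];
static int call_depth;           /* counts driver entries, illustration only */
static uint64_t last_blkno;      /* last block the "real" driver was asked for */

/* Stand-in for a real disk driver's strategy routine. */
static int
disk_strategy(struct io_req *rq)
{
    assert(rq != NULL);
    call_depth++;
    last_blkno = rq->blkno;
    return 0;
}

/* Wedge driver entry: find the parent, rewrite the request, call down. */
static int
wedge_strategy(struct io_req *rq)
{
    struct wedge *w;

    assert(rq != NULL);
    call_depth++;
    if (rq->minor >= NWEDGES || wedges[rq->minor].parent_strategy == NULL)
        return -1;
    w = &wedges[rq->minor];
    rq->minor = w->parent_minor;
    rq->blkno += w->offset;
    return w->parent_strategy(rq);   /* the second driver entry */
}
```

Every request that succeeds passes through two strategy routines here,
which is the cost being objected to.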

With the wedge bookkeeping table == an in-core "struct disklabel"
(remembering that the in-core one's now not necessarily related to an
actual on-disk struct disklabel), all the current drivers have to do is
just look in a bigger table than they use now, i.e. the number of
partitions goes up to 64 from 8 or 16.

Whenever we do i/o on a wedge in the latter scheme, we've already entered
the correct driver. We just need to scan the table. :-)
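
A rough sketch of what "look in a bigger table" could amount to. The
structure and function names are hypothetical, and the 64-entry size is the
figure from this mail, not an existing kernel constant.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAXPARTS 64              /* assumed per-unit partition count */

struct part_entry {              /* hypothetical in-core table entry */
    uint64_t offset;             /* start of the partition, in sectors */
    uint64_t size;               /* length in sectors; 0 means unused */
};

struct disk_table {              /* hypothetical per-unit bookkeeping table */
    struct part_entry parts[MAXPARTS];
};

/* Split a minor number into unit and partition index. */
static unsigned minor_unit(unsigned minor) { return minor / MAXPARTS; }
static unsigned minor_part(unsigned minor) { return minor % MAXPARTS; }

/*
 * Translate a partition-relative block number into an absolute one.
 * Returns 0 on success, -1 if the partition is unused or the request
 * would run past its end.
 */
static int
part_translate(const struct disk_table *dt, unsigned minor,
    uint64_t blkno, uint64_t nblks, uint64_t *absblk)
{
    const struct part_entry *pe;

    assert(dt != NULL && absblk != NULL);
    pe = &dt->parts[minor_part(minor)];
    if (pe->size == 0 || blkno > pe->size || nblks > pe->size - blkno)
        return -1;
    *absblk = pe->offset + blkno;
    return 0;
}
```

The driver has already been entered through its own major number, so the
only added work is the index into parts[] and the bounds check.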

> > While the major numbers for the devices with wedges will be different
> > from our current major numbers for the same type of device,
> 
> I don't see how this follows from anything.  It might be a good idea
> for various reasons, but I certainly don't see it as anything more.

It doesn't follow from anything in this thread, other than the fact we'd
be changing the number of per-unit partitions. When this has been
discussed in the past, we decided (at Charles's correct insistence) that
we need to change the major number when we change the unit/partition
split.
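
To illustrate why the split matters, here is a small sketch (decode_minor
and struct devpos are hypothetical names) showing that one minor number
names different unit/partition pairs depending on how many partitions each
unit gets. A device node created under the old split would silently point
somewhere else under the new one unless the major number also changes.

```c
#include <assert.h>

struct devpos {                  /* hypothetical decoded minor number */
    unsigned unit;
    unsigned part;
};

/* Decode a minor number for a given partitions-per-unit split. */
static struct devpos
decode_minor(unsigned minor, unsigned maxparts)
{
    struct devpos dp;

    assert(maxparts != 0);
    dp.unit = minor / maxparts;
    dp.part = minor % maxparts;
    return dp;
}
```

With 8 partitions per unit, minor 17 is unit 2, partition 1; with 16 it is
unit 1, partition 1; with 64 it is unit 0, partition 17.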

> > I think the major number for the wedges should be the same as for the
> > device.
> 
> > Given that, the minor number for the whole-disk has to be one of the
> > minor numbers for the drive. :-)
> 
> Well, yes.  But it doesn't have to have a partition number.
> 
> I could imagine, for example...
> 
> sd0 at scsibus0 ....
> -> sd0 raw partition at sd major, minor 0
> -> label shows three partitions, #0, #1, #5
> sd1 at scsibus0 ....
> -> sd1 raw partition at sd major, minor 1
> -> label shows two partitions, #4, #7
> ....
> wedge covering sd0:
> partition 0 ("sd0a") at sd major, minor 2
> partition 1 ("sd0b") at sd major, minor 3
> partition 5 ("sd0f") at sd major, minor 4
> wedge covering sd1:
> partition 4 ("sd1e") at sd major, minor 5
> partition 7 ("sd1h") at sd major, minor 6
> 
> You can still access "raw" sd0 by accessing minor 0 instead of minors
> 2, 3, or 4.
> 
> Not that I'm saying this is the only, or even best, way to do wedges
> that share their major numbers with their underlying devices - indeed,
> I can see some fairly obvious potential problems with it.  But it
> serves as an example of a major-sharing scheme in which underlying
> devices can be accessed directly rather than having to appear as a
> partition.

That's a MUCH bigger change than what I thought wedges were supposed to
be. To really do this, we need a MUCH more sophisticated system. First off,
we have to keep state (what minor's what). Second off, single-user's a
pain. I like to be able to drop into single user, and then fsck things to
my heart's content. That'd be real hard this way as the wedge configurator
would have to run first.

It would be really hard to do alternate-root boots. /dev becomes history
dependent...
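
A toy sketch of the history dependence, assuming minors are handed out in
attach order as in the quoted example. The allocator and its names are
invented for illustration; nothing like it is claimed to exist in the tree.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAXMINORS 32

static const char *minor_names[MAXMINORS]; /* which device owns each minor */
static unsigned next_minor;

/* Forget all assignments, as a reboot would. */
static void
reset_minors(void)
{
    memset(minor_names, 0, sizeof(minor_names));
    next_minor = 0;
}

/* Hand out the next free minor; returns it, or -1 if none are left. */
static int
assign_minor(const char *name)
{
    assert(name != NULL);
    if (next_minor >= MAXMINORS)
        return -1;
    minor_names[next_minor] = name;
    return (int)next_minor++;
}

/* Report which device a minor currently names, or NULL if none. */
static const char *
minor_owner(unsigned minor)
{
    return minor < MAXMINORS ? minor_names[minor] : NULL;
}
```

If one boot attaches sd0, sd1, then the sd0a wedge, sd0a lands on minor 2.
If the next boot finds only sd0, sd0a lands on minor 1, and a /dev node
made for minor 2 no longer names anything.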

> >> Remember, with wedges you're not talking about "disklabels" anymore,
> >> you're talking about some wedge-bookkeeping structure for a disk,
> >> which should contain *no* knowledge itself about what a "raw" device
> >> is.
> > The wedge-bookkeeping structure you talk about above is the same
> > thing as the in-core disklabel Leo was talking about.
> 
> Only if you insist on sharing major numbers.  I still don't see what
> that buys you, aside from complicating every disk driver with wedges
> instead of isolating the wedge layer cleanly off in its own driver.

How is it complicating every driver? They currently read the in-core
disklabel. With wedges basically being entries in an in-core disklabel
with 64 entries, it's the exact same thing we do now. In fact, since every
driver hands the partition dealing off to a subroutine, I don't think the
drivers would need to change at all. :-)

> > The one thing I didn't like about the wedge proposal is it handed a
> > lot off to the userland daemon.
> 
> This is one of the things I *do* like about it - it gets all the
> disklabel-interpreting hair out of the kernel.  (Well, most of it;
> depending on the port, it may have to have a little in order to find
> its root filesystem.)
> 
> > I'd like to be able to configure support for a few different
> > partitioning scheme readers into the kernel.
> 
> As long as it doesn't excessively complicate the rest of it, I have no
> problem with this.  I just don't want to *have* to do that.

All these readers are is subroutines which read the disk and then try to
fill in a table from it. It's only hairy now because we've hacked
different disklabel support on top of a system originally designed for
only one.
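
As a hedged sketch of that idea: each reader is one function that looks at
sector 0 and either fills in the table or declines. The MBR offsets used
(signature 0x55 0xAA at bytes 510-511, four 16-byte entries from byte 446)
are the standard layout; the "TOY1" format, the struct names, and
probe_partitions are made up purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAXPARTS 64

struct part_entry { uint64_t offset, size; };
struct part_table {
    struct part_entry parts[MAXPARTS];
    unsigned nparts;
};

typedef int (*label_reader)(const uint8_t *sec0, struct part_table *pt);

/* Read a little-endian 32-bit value. */
static uint32_t
le32(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
        (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* MBR reader: primary entries only, no extended partitions. */
static int
read_mbr(const uint8_t *sec0, struct part_table *pt)
{
    int i;

    if (sec0[510] != 0x55 || sec0[511] != 0xAA)
        return -1;
    pt->nparts = 0;
    for (i = 0; i < 4; i++) {
        const uint8_t *e = sec0 + 446 + 16 * i;

        if (e[4] == 0)              /* type 0 marks an unused entry */
            continue;
        pt->parts[pt->nparts].offset = le32(e + 8);
        pt->parts[pt->nparts].size = le32(e + 12);
        pt->nparts++;
    }
    return 0;
}

/* Hypothetical format: "TOY1", then one le32 offset and one le32 size. */
static int
read_toy(const uint8_t *sec0, struct part_table *pt)
{
    if (memcmp(sec0, "TOY1", 4) != 0)
        return -1;
    pt->parts[0].offset = le32(sec0 + 4);
    pt->parts[0].size = le32(sec0 + 8);
    pt->nparts = 1;
    return 0;
}

static const label_reader readers[] = { read_toy, read_mbr };

/* Try each configured reader in turn; the first to recognize the disk wins. */
static int
probe_partitions(const uint8_t *sec0, struct part_table *pt)
{
    size_t i;

    assert(sec0 != NULL && pt != NULL);
    for (i = 0; i < sizeof(readers) / sizeof(readers[0]); i++)
        if (readers[i](sec0, pt) == 0)
            return 0;
    return -1;
}
```

Configuring a scheme into the kernel would then mean adding one entry to
readers[]; the drivers only ever see the filled-in table.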

Take care,

Bill