Subject: Re: Thoughts about wedges
To: Leo Weppelman <leo@wau.mis.ah.nl>
From: Bill Studenmund <wrstuden@nas.nasa.gov>
List: tech-kern
Date: 09/23/1999 12:22:32

On Thu, 23 Sep 1999, Leo Weppelman wrote:

> From both the proposal mail-archive that was put on the net by Frank and the
> discussion following it, I concluded/interpreted the following:
> 
>   * The kernel builds the wedge device for the root filesystem. This means
>     that at least one of the various partitioning types must be supported
>     by the kernel. (Most probably the one 'native' to the port).

Actually, Charles's initial idea didn't even include that. The boot blocks
would tell the kernel which device to use, plus an offset and a size. But a
couple of people (myself included) really don't like that idea, since we'd
then need a daemon to fire up in single user in order to fsck anything more
than the root filesystem. I think being able to configure in support for
one or a few partition types would be fine (and if we moved to boot blocks
just passing in an offset & size, you could move to having none configured
in).

>   * All disklabel related ioctl's will be moved out of the various drivers
>     and will be either:
>           - removed (ie. moved to the compat-area) [ is this sdcompat?? ]
>           - moved to the wedge driver.

It would be compat_14 if we deprecated the ioctl's, but see below.

>   * The wedge driver is the keeper of the wedge (partition?) info on all disk
>     devices. I envision the equivalent of what I called 'in-core disklabel'
>     to be present for each device.

Assuming we keep the unit/partition split of the minor # (we reserve 64
per drive), then the wedge info is the same thing as the "in-core
disklabel" you were talking about. :-) As such, what will the wedge driver
really do that the ioctl's don't do now?
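
To make the minor # arithmetic concrete, here's a rough sketch in C. It
only assumes the 64-per-drive split mentioned above; the helper names
(wedge_unit, wedge_part, wedge_minor) are made up for illustration and are
not the macros the drivers use today.

```c
#include <stdint.h>

/* Assumption: 64 minor numbers are reserved per drive. */
#define WEDGES_PER_UNIT 64u

/* Hypothetical helper: which drive (unit) a minor number belongs to. */
uint32_t wedge_unit(uint32_t minor_no)
{
    return minor_no / WEDGES_PER_UNIT;
}

/* Hypothetical helper: which wedge/partition slot within that drive. */
uint32_t wedge_part(uint32_t minor_no)
{
    return minor_no % WEDGES_PER_UNIT;
}

/* Hypothetical helper: rebuild the minor number from unit and slot. */
uint32_t wedge_minor(uint32_t unit, uint32_t part)
{
    return unit * WEDGES_PER_UNIT + part;
}
```

With that split, slot 0 of unit 1 would be minor 64, and the per-unit table
of up to 64 entries is exactly the in-core information a driver keeps.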

I've scanned over a few drivers (sd.c, ccd.c, vnd.c), and for DIOCWDINFO &
DIOCSDINFO they all do similar things. They do some preliminary error
checking (are we open for write in some, can we lock the device for vnd,
etc), mark the unit as being labelled (VNF_LABELLING, WDF_LABELLING,
etc), then call setdisklabel. If there was no problem setting the label
and the command is DIOCWDINFO, they then call writedisklabel.
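
Boiled down, the common pattern looks something like the following. This
is a simplified, self-contained model, not code lifted from any driver:
the fake_* names, the struct fields, and the CMD_* constants are stand-ins
I invented for the real softc, setdisklabel/writedisklabel, and
DIOCSDINFO/DIOCWDINFO.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for DIOCSDINFO / DIOCWDINFO. */
enum label_cmd { CMD_SDINFO, CMD_WDINFO };

/* Hypothetical, heavily reduced label. */
struct fake_label { int npartitions; };

/* Hypothetical per-unit driver state. */
struct fake_softc {
    bool open_for_write;      /* was the device opened for writing? */
    bool labelling;           /* models flags like VNF_LABELLING */
    struct fake_label label;  /* the in-core label */
    int writes;               /* counts on-disk label writes */
};

/* Stand-in for setdisklabel: validate, then update the in-core copy. */
static int fake_setdisklabel(struct fake_label *cur,
                             const struct fake_label *new_label)
{
    if (new_label->npartitions <= 0)
        return EINVAL;
    *cur = *new_label;
    return 0;
}

/* Stand-in for writedisklabel: just record that a write happened. */
static int fake_writedisklabel(struct fake_softc *sc)
{
    sc->writes++;
    return 0;
}

/* The shape shared by the drivers' set/write label ioctl handling. */
int fake_label_ioctl(struct fake_softc *sc, enum label_cmd cmd,
                     const struct fake_label *new_label)
{
    int error;

    if (!sc->open_for_write)          /* preliminary error checking */
        return EBADF;

    sc->labelling = true;             /* mark the unit as being labelled */
    error = fake_setdisklabel(&sc->label, new_label);
    if (error == 0 && cmd == CMD_WDINFO)
        error = fake_writedisklabel(sc);
    sc->labelling = false;

    return error;
}
```

The driver-specific parts are the checks at the top and the flag handling;
the two calls in the middle are where the common code would be reached.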

All we really need to do is delete readdisklabel and setdisklabel as MD
routines and totally kill writedisklabel (as userland tools do this now).
We then make readdisklabel and setdisklabel the front ends to the wedge
system. We can change the names to remove disklabel if we wish. :-)

I think this'd be a good place to draw the line, as all of the things
drivers do in response to these ioctl's seem to be driver specific, and I
think it's most efficient to leave it as such.

I'd envision the wedge code as being a library, say libwedge. It would
have a read and a set entrypoint. In response to a read, it would call
different partition reading routines, the exact list of which is a compile
option. This library could/should also be used by the boot loaders when
they need to figure out partitions. The partition reading code should
likewise be shared with userland, probably in a userland libwedge or
whatever.

The idea is that the userland version would know how to read all of the
partition types we know, and the kernel & boot blocks would know of a more
limited set of types.
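
A minimal sketch of what the read entry point might look like, assuming a
table of reader routines selected at compile time. Everything here is
hypothetical: the wedge_* names, the WEDGE_WITH_* options, and the two toy
on-disk formats exist only for illustration. The set side isn't shown.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical compile options; a kernel or boot block build would
 * switch some of these off, a userland build would enable them all. */
#define WEDGE_WITH_TOY_A 1
#define WEDGE_WITH_TOY_B 1

#define WEDGE_MAX 64

struct wedge { uint64_t offset; uint64_t size; };
struct wedge_table { size_t count; struct wedge w[WEDGE_MAX]; };

/* A reader returns 0 if it recognized the sector, -1 otherwise. */
typedef int (*wedge_reader)(const uint8_t *sec, size_t len,
                            struct wedge_table *out);

#if WEDGE_WITH_TOY_A
/* Toy format A: magic "A1", then one (offset, size) byte pair. */
static int read_toy_a(const uint8_t *sec, size_t len,
                      struct wedge_table *out)
{
    if (len < 4 || sec[0] != 'A' || sec[1] != '1')
        return -1;
    out->count = 1;
    out->w[0].offset = sec[2];
    out->w[0].size = sec[3];
    return 0;
}
#endif

#if WEDGE_WITH_TOY_B
/* Toy format B: magic "B1", then two (offset, size) byte pairs. */
static int read_toy_b(const uint8_t *sec, size_t len,
                      struct wedge_table *out)
{
    if (len < 6 || sec[0] != 'B' || sec[1] != '1')
        return -1;
    out->count = 2;
    out->w[0].offset = sec[2];
    out->w[0].size = sec[3];
    out->w[1].offset = sec[4];
    out->w[1].size = sec[5];
    return 0;
}
#endif

/* NULL-terminated so the table stays valid with every option off. */
static const wedge_reader wedge_readers[] = {
#if WEDGE_WITH_TOY_A
    read_toy_a,
#endif
#if WEDGE_WITH_TOY_B
    read_toy_b,
#endif
    NULL
};

/* Read entry point: try each configured reader until one matches. */
int wedge_read(const uint8_t *sec, size_t len, struct wedge_table *out)
{
    for (size_t i = 0; wedge_readers[i] != NULL; i++)
        if (wedge_readers[i](sec, len, out) == 0)
            return 0;
    out->count = 0;
    return -1;
}
```

The same source file could then be built three ways (kernel, boot blocks,
userland), differing only in which readers are compiled in.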

Doing this, we'd end up with one piece of code for reading a given disk
partitioning scheme, as opposed to the zoo of routines we have now.

>   * Recursive partitions are a special case of overlapping partitions.
> 
>   * It is a bit unclear when wedge info is destroyed. Must this be done
>     explicitly (per disk device, per wedge)? Will media change be detected
>     by the wedge driver?

I'd say that the thread triggered when you first started talking about
changing disklabels would cover when we want wedges to die. :-) I think
the disklabel invalidation code would do the right thing here.

>   * We have to figure out a naming scheme. We've just beaten the cXtXdXsX
>     scenario to pulp. It looks like 'we' like the driver type based scheme.
>     Whatever, I personally think that there must be a link between the
>     <driver><instance><partition> (ie. sd1a) and the wedge name.

I'll leave that to the userland daemon. As long as there's something like
a sd1a around somewhere, I'm happy.

> I mostly wrote the above to clear up my mind a bit. I hope it tells something
> useful for others too ;-)

Yep. :-)

Take care,

Bill