Subject: A new partition handling scheme: wedges
To: None <tech-kern@NetBSD.ORG>
From: Charles M. Hannum <mycroft@mit.edu>
List: tech-kern
Date: 01/25/1998 19:06:44

Here I propose a new way of handling partitions in NetBSD.  This
proposal is aimed at addressing some problems that have not been
addressed in previous proposals or implementations (including our
current scheme and `slices').  I call the idea `wedges'.


The goals of this proposal are to:

* allow all partitioning schemes to work on all architectures,

* remove as much knowledge of partitioning from the kernel as
possible,

* easily support recursive partitioning (e.g. a disk image with its
own partitions within a partition of a larger disk, for creating CD
images),

* remove any handling of `partition' numbers from device drivers.


The proposal is thus:

* Remove all knowledge of partitioning from device drivers, and have
them deal only with flat disks.

* Introduce a new block device, which I refer to as a `wedge', which
may be configured (using an ioctl) to transpose strategy calls onto a
particular section of another block device.

* Introduce a utility to configure wedge devices.  In addition to
supporting direct configuration, this utility will know how to read
all types of partitioning information that we support and
automatically create multiple wedges based on this.  (More details on
this below.)

* Automatically create a wedge for the root file system based on
information from the boot block, with a static name in the file system
(say `/dev/root').  When remounting a file system, if the block device
numbers mismatch, check whether they are two block devices pointing at
the same wedge.  (This deals with booting.  It also allows
renumbering of wedges if you want -- something I was always annoyed
that I couldn't do with labels.  It also has the interesting side
benefit that you can remount / read-write without needing a device
node for the particular disk you booted from.)

* When opening a wedge, check for other overlapping wedges that are
open.  (This is effectively the same as the old partition overlap
check.)

* When configuring a wedge, check to see if it's already in use.
(There are a couple of ways to deal with this, actually.  One is to
allow a wedge to be opened before it's configured and simply return
EBUSY on subsequent opens.  Another is to have separate `I/O' and
`control' nodes for each wedge, and check whether the I/O node is open
before allowing configuration.  The latter would allow us to open the
control node and query it even while the wedge is in use, which would
be winning.)

* For legacy installations, have a separate pseudo-device
(e.g. `sdcompat') which knows how to read old `labels' from an
underlying disk and automatically creates and destroys wedges on open
and close of a `partition'.  This will allow a user to use old device
names and device nodes, but prevents any of the implementation from
leaking into the (new, simplified) disk drivers.  It can also be used
to transparently deal with old boot programs that pass in partition
numbers (rather than wedge information), while still allowing the rest
of the system to use wedges natively.
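
To make the core of the list above concrete, here is a rough sketch of
the two pieces of arithmetic a wedge needs: transposing a
wedge-relative request onto the underlying device, and the overlap
check done at open time.  This is only an illustration, not kernel
code.  The `struct wedge' layout, the function names, the integer
`parent' identifier and the inclusive first/last block range are all
assumptions made for the example.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one configured wedge. */
struct wedge {
    int      parent;  /* identifies the underlying flat disk */
    uint64_t first;   /* first block on the parent (inclusive) */
    uint64_t last;    /* last block on the parent (inclusive) */
    bool     open;    /* is the wedge currently open? */
};

/*
 * Transpose a request for `nblks' blocks starting at wedge-relative
 * block `blk' onto the parent device.  Returns false if the request
 * is empty or does not fit entirely inside the wedge.
 */
bool wedge_translate(const struct wedge *w, uint64_t blk, uint64_t nblks,
                     uint64_t *parent_blk)
{
    uint64_t span = w->last - w->first;  /* wedge size minus one */

    if (nblks == 0 || blk > span || nblks - 1 > span - blk)
        return false;
    *parent_blk = w->first + blk;
    return true;
}

/* Two wedges overlap if they share a parent and their ranges intersect. */
bool wedge_overlaps(const struct wedge *a, const struct wedge *b)
{
    return a->parent == b->parent &&
           a->first <= b->last && b->first <= a->last;
}

/*
 * Open-time check: refuse with EBUSY if any other open wedge in the
 * table overlaps the wedge at index `idx'; return 0 otherwise.
 */
int wedge_open_check(const struct wedge *tab, size_t n, size_t idx)
{
    for (size_t i = 0; i < n; i++) {
        if (i != idx && tab[i].open && wedge_overlaps(&tab[i], &tab[idx]))
            return EBUSY;
    }
    return 0;
}
```

A real implementation would do the translation inside the wedge's
strategy routine and hand the adjusted request to the parent's
strategy routine; the linear table scan is just the simplest thing
that shows the check.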


I envision the `wedgeconfig' utility having a syntax something like
the following.  (Note that I'm not married to this UI.  It's provided
only as an example of what you'll be able to do.)

# wedgeconfig /dev/wedge0 /dev/sd0 32768 65535

Makes /dev/wedge0 transpose requests to /dev/sd0 blocks 32768-65535.
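
As a sketch of what the utility might do with those arguments before
issuing the configuration ioctl, the following parses them into a
request structure.  Everything here is hypothetical: no such ioctl or
structure exists yet, `struct wedge_conf' and the function names are
invented for the example, and the two numbers are assumed to be the
first and last block, both inclusive.

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical argument block for a wedge configuration ioctl. */
struct wedge_conf {
    char     parent[64];  /* path of the underlying block device */
    uint64_t first;       /* first block (inclusive) */
    uint64_t last;        /* last block (inclusive) */
};

/* Parse an unsigned decimal block number; returns 0 on success, -1 on error. */
int parse_block(const char *s, uint64_t *out)
{
    char *end;
    unsigned long long v;

    /* strtoull() would quietly accept a sign or leading blanks. */
    if (!isdigit((unsigned char)s[0]))
        return -1;
    errno = 0;
    v = strtoull(s, &end, 10);
    if (errno != 0 || *end != '\0')
        return -1;
    *out = (uint64_t)v;
    return 0;
}

/* Fill in a request from command-line strings; returns 0 or -1. */
int wedge_conf_fill(struct wedge_conf *wc, const char *parent,
                    const char *first, const char *last)
{
    int n = snprintf(wc->parent, sizeof wc->parent, "%s", parent);

    if (n < 0 || (size_t)n >= sizeof wc->parent)
        return -1;                      /* path too long */
    if (parse_block(first, &wc->first) != 0 ||
        parse_block(last, &wc->last) != 0)
        return -1;
    if (wc->last < wc->first)
        return -1;                      /* empty or reversed range */
    return 0;
}

/* Number of blocks covered by the request. */
uint64_t wedge_conf_blocks(const struct wedge_conf *wc)
{
    return wc->last - wc->first + 1;
}
```

Under the inclusive-range assumption, the example command above
describes a wedge of 65535 - 32768 + 1 = 32768 blocks.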

# wedgeconfig -L /dev/sd0 compat /dev/sd0a /dev/sd0b /dev/sd0c ...

Configures wedges for each partition listed in the `compat'-type label
on /dev/sd0, and creates alternate names for them.  (Depending on the
existing configuration, an essentially random wedge device will be
chosen for each partition, so the alternate names will have to be
recreated.  They could either be symlinks or actual device nodes; if
the latter, we can preserve time and ownership information.)

# wedgeconfig -L /dev/sd0 auto ...

Like the previous, but checks for multiple types of partitioning info
automatically.  This is what most people should be able to use.

# wedgeconfig -a

Configures wedges for all attached disks using automatic label
detection and a default set of names (essentially the disk device name
with a `partition' letter attached).  (Or maybe it should read a
configuration file instead.)
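
The default naming rule for `-a' could be as simple as the following
sketch.  The helper name and the limit of 26 letters are assumptions
made for the example; nothing in the proposal fixes either.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: build the default alternate name for the
 * `part'-th wedge found on `disk' by appending a partition letter,
 * e.g. "/dev/sd0" and 0 give "/dev/sd0a".  Returns 0 on success, or
 * -1 if there is no letter for `part' or the buffer is too small.
 */
int default_wedge_name(char *buf, size_t len, const char *disk, unsigned part)
{
    static const char letters[] = "abcdefghijklmnopqrstuvwxyz";
    int n;

    if (part >= sizeof letters - 1)
        return -1;
    n = snprintf(buf, len, "%s%c", disk, letters[part]);
    if (n < 0 || (size_t)n >= len)
        return -1;
    return 0;
}
```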


Random notes:

* When remounting to a different wedge, all the cached blocks for the
old wedge will now be used by the new wedge.  To make this work,
either the blocks need to be cached by the underlying device driver,
or the block descriptors (struct bufs) need to be modified.

* We need an efficient way of finding the next unused wedge.  (Can you
say `cloning devices'?  With control/I/O nodes, it's very similar to
the pty master/slave model, except that a wedge needs to remain
configured even while the control and I/O nodes are both closed.)

* I'm sure some people will find this objectionable.  B-)