Subject: Re: partition bookkeeping
To: der Mouse <mouse@Rodents.Montreal.QC.CA>
From: Greywolf <greywolf@starwolf.com>
List: tech-kern
Date: 09/22/1999 16:25:08
On Wed, 22 Sep 1999, der Mouse wrote:
[including article from Bill Studenmund with '> ']
# > That's why we won't hard-code things. It'd waste too much space,
# > because you have to code for the max value when doing so, and all the
# > cases which don't use anything near the max (like how most SCSI
# > devices only use LUN 0) will just waste resources.
#
# This is one of the arguments in favor of devfs (at least, those who
# like c0t0s0 names might think so): it makes this possible. There's no
# reason you couldn't have minors allocated only for those devices that
# actually exist; as long as /dev/dsk/dks0d2s5 gets partition 5 on device
# 2 on controller 0, nobody cares whether its minor number is anything in
# particular.
A devfs is completely independent of the naming convention. Names are
for the humanoid types.
If we move all the disklabel handling out of the kernel, it is worth
asking whether the minor number will ever be used at all, since in the
device driver the minor number encodes the disk number (say, in all the
bits above the fifth) and the partition to reference on that disk (in
bits 0-4; 22 partitions overflow four bits).
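To make the bit layout described above concrete, here is a minimal sketch in C. It assumes exactly the split mentioned in the text (partition in bits 0-4, disk number in everything above). The names PART_BITS, make_minor, disk_unit and disk_part are hypothetical and are not taken from any real driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: partition in bits 0-4, disk unit in the bits above. */
#define PART_BITS 5u
#define PART_MASK ((1u << PART_BITS) - 1u)  /* 0x1f */
#define MAX_PARTS (1u << PART_BITS)         /* 32 slots, enough for 22 partitions */

/* Pack a disk unit and a partition index into one minor number. */
static inline uint32_t make_minor(uint32_t unit, uint32_t part)
{
    assert(part < MAX_PARTS);               /* partition must fit in five bits */
    return (unit << PART_BITS) | part;
}

/* Recover the disk unit: drop the low five partition bits. */
static inline uint32_t disk_unit(uint32_t minor)
{
    return minor >> PART_BITS;
}

/* Recover the partition index: keep only the low five bits. */
static inline uint32_t disk_part(uint32_t minor)
{
    return minor & PART_MASK;
}
```

With this layout, partition 3 on disk 2 gets minor (2 << 5) | 3 = 67. Four bits would give only 16 slots, which is why 22 partitions need the fifth bit.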
If the kernel has no disklabel/offset information, how does the
driver get the bus/controller/unit/offset information?
Let me guess, using fsck as an example:

    fsck gets "/dev/disk/rsd2d" as its device
    fsck calls open("/dev/disk/rsd2d")                  CS #1
    Eventually the vnode gets resolved, and
    the open routine for the vnode goes
    out to userland to talk to this daemon
    which handles all the device mappings               CS #2
    The daemon runs and returns the information
    to the vopen (or whatever)                          CS #3
    The filehandle finally gets returned to
    fsck.                                               CS #4

Congratulations. We're now doing twice as many context switches
as we really need to. Granted, we won't be needing to do this very
often, I'd surmise; nonetheless, is this the start of a trend to
move things out of the kernel and into userland?
Microkernels don't work -- context switching is not cheap.
And what, pray tell, do we do if this daemon happens to get
corrupted on the disk? We cannot fsck beyond the root
filesystem in single-user mode. (Never mind that there is
probably more wrong if the daemon is zorched.) We now
have to restore from tape (we do have tape, right? Oh, my.
No, we don't because we're doing dynamic device allocation which
needs that daemon. Or is this just for disks?) just to
run fsck on our filesystems.
I don't see that we really consume all that much space by
mapping the partition information into kernel space:
It's, what, a dev_t, a length and an offset plus some flags?
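As a rough illustration of that size estimate, here is a sketch of what one in-kernel partition record might look like. The struct part_entry, its field widths, and the helper part_table_bytes are assumptions made for the example. A real dev_t and real disklabel fields may be sized differently.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-partition record: a device, a length, an offset, flags. */
struct part_entry {
    uint32_t dev;     /* stand-in for dev_t (assumed 32 bits here) */
    uint32_t flags;   /* e.g. read-only, in-use (assumed) */
    uint64_t offset;  /* start of the partition, in sectors */
    uint64_t length;  /* size of the partition, in sectors */
};

/* Bytes needed to keep a full table for ndisks disks with nparts slots each. */
static inline size_t part_table_bytes(size_t ndisks, size_t nparts)
{
    return ndisks * nparts * sizeof(struct part_entry);
}
```

Under these assumptions a record is a couple of dozen bytes, so part_table_bytes(8, 22), that is, eight disks with 22 partitions each, comes to a few kilobytes.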
I'm obviously missing something here because on the surface, such
a dynamic scheme actually does look cool (as long as we don't go
to the graveyard that System V made!). But how realistic is this?
#
# der Mouse
#
# mouse@rodents.montreal.qc.ca
# 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B
#
--*greywolf;
--
Microsoft:
"Just click on the START button and your journey to the Dark Side
will be complete!"