Subject: Re: 32 bit dev_t
To: Charles M. Hannum <mycroft@mit.edu>
From: Todd Vierling <tv@NetBSD.ORG>
List: tech-kern
Date: 01/13/1998 09:48:23
On Tue, 13 Jan 1998, Charles M. Hannum wrote:
: * As an alternative to renumbering, you could instead make the minor
: number non-contiguous; i.e.:
:
:   3322222222221111111111
:   10987654321098765432109876543210
:   |--minor---||--major---||minor-|  new
:                   |major-||minor-|  old
:
: * Given the above scheme, should we decide to renumber everything at
: some point (hopefully only once!!!), we can simply choose some portion
: of the major numbers (say, the first 256) and use them for
: `compatibility'. There's no need to waste half the number space.
Well, you don't waste `half' in reserving `new' major number 0--you only
lose 1,048,576 out of 4,294,967,296 possible device numbers. But there's a
big problem with splitting the numbers this way: kernel overhead.
Disassembling and reassembling pieces of ints turns a bitand-and-shiftright
into a (bitand)-bitor-(bitand-and-shiftright). Not that this is too bad,
but it's more than it is worth, particularly if you need to look at a
hexadecimal dev_t in kernel data dumps and figure out what it is. Why
bother, if the dev_t's will be renumbered anyway?
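
A minimal sketch of the difference in C, for illustration only. The function names are hypothetical; the bit positions for the split layout are read off the quoted diagram (minor-high in bits 31..20, major in 19..8, minor-low in 7..0), and the contiguous layout is the 12/20 split discussed in this thread.

```c
#include <stdint.h>

/* Contiguous 12/20 layout: major in bits 31..20, minor in bits 19..0. */
static uint32_t contig_major(uint32_t dev) { return (dev >> 20) & 0xfff; }
static uint32_t contig_minor(uint32_t dev) { return dev & 0xfffff; }

/* Split layout: major sits in the middle, so it is still one shift+mask. */
static uint32_t split_major(uint32_t dev) { return (dev >> 8) & 0xfff; }

/* The minor has to be stitched together from two separate fields. */
static uint32_t split_minor(uint32_t dev)
{
    return ((dev & 0xfff00000) >> 12) | (dev & 0xff);
}

/* Building a split dev_t likewise scatters the minor across two fields. */
static uint32_t split_makedev(uint32_t maj, uint32_t min)
{
    return ((min & 0xfff00) << 12) | ((maj & 0xfff) << 8) | (min & 0xff);
}
```

Under the split layout an old 16-bit number such as 0x0305 decodes unchanged (major 3, minor 5), which is its whole appeal; the cost is the extra mask-and-or on every minor lookup and a value that no longer reads cleanly in a hex dump.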
(On a side note, I now remember why a 12/20 split Felt Right as opposed to
10/22 and 14/18. In a 12/20, you can take a dev_t as a hex number and pull
major and minor out of it; 3 digits major, 5 digits minor.)
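
The hex-digit property can be sketched as follows. The helper name `dev_to_hex` is made up for this example; it only assumes the 12/20 layout (major in the top 12 bits, minor in the low 20).

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Format a 12/20 dev_t as "MMM/mmmmm" in hex.
 * buf must hold at least 10 bytes (3 + 1 + 5 digits plus the NUL).
 */
static void dev_to_hex(uint32_t dev, char buf[10])
{
    snprintf(buf, 10, "%03x/%05x",
        (unsigned)((dev >> 20) & 0xfff),   /* first three hex digits */
        (unsigned)(dev & 0xfffff));        /* last five hex digits */
}
```

Because 12 and 20 are both multiples of 4, the field boundary falls between hex digits: 0x01200003 splits by eye into major 0x012 and minor 0x00003. With a 10/22 or 14/18 split the boundary would land in the middle of a digit.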
: * If you're going to change all the existing device numbers, rather
: than turning major() and minor() into functions, you should do any
: old->new conversion (if necessary) in checkalias(), before it's put in
: v_specinfo. This value is *only* used internally to the kernel.
Well, only internally to the kernel, but not necessarily in just one place.
And it's not the only place that cares about the value of a dev_t. Problem
is, as I see it, if I were to convert the value before it gets put into an
info structure, then __stat13() would break--it would get the (converted)
value of a dev_t, which is *not* what we want. We want __stat13() to get an
opaque value, and as long as the dev_t's are never converted in storage
(only in kernel comparisons), things like finding a tty won't break.
: (Think about what you're proposing doing to your serial port read()
: path, for example. A couple of calls to major(), one to minor(), etc.
: How many more layers of indirection before you make it slow enough
: that nobody wants to use it any more?)
Let me quote my /sys/sys/types.h:
typedef u_int32_t dev_t; /* device number */
[...]
#if defined(_KERNEL) && defined(COMPAT_13)
extern const dev_t __devcvt32 __P((dev_t)); /* convert major _and_ minor */
extern const u_int __devcvt32maj __P((dev_t)); /* convert major (faster) */
#define devcvt32(x) ((dev_t)(x) & 0xfff00000 ? (dev_t)(x) : __devcvt32(x))
#define major(x) ((u_int)(x) & 0xfff00000 ? \
    ((u_int)(x) >> 20) & 0xfff : __devcvt32maj(x))
#define minor(x) ((u_int)(devcvt32(x) & 0xfffff))
#define makedev(x,y) (((((dev_t)(x) & 0xfff) << 20) | ((dev_t)(y) & 0xfffff)))
#define devcmp(x,y) (devcvt32(x) != devcvt32(y))
[...]
You'll note that I added inline checks for `new' and `old' devices in
major(), minor(), devcvt32(), and devcmp(). If a device is `new', a
function call is never made. Better yet, these all drop out if COMPAT_13 is
undefined.
The devcvt32() interface (without the underscores) exists only as a backend
for generating hash seed values that are the same for 16 and 32 bit devices,
and is used nowhere in my code modifications to convert a value before it
gets stored in some structure.
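
A user-space sketch of how the inline check behaves, under stated assumptions. All `ex_` names are hypothetical stand-ins (prefixed to avoid clashing with system headers); the conversion function is a stub that keeps major and minor the same, and the hash function is invented purely to show the "same seed for 16 and 32 bit devices" idea.

```c
#include <stdint.h>

typedef uint32_t ex_dev_t;

static int ex_cvt_calls;    /* counts out-of-line conversions, for the demo */

/* Stub conversion: old 8/8 number -> 12/20 number, same major and minor. */
static ex_dev_t ex_cvt32(ex_dev_t old)
{
    ex_cvt_calls++;
    return (((old >> 8) & 0xff) << 20) | (old & 0xff);
}

/* Any bit set in the top 12 bits means "new": no function call is made. */
#define ex_devcvt32(x) \
    (((ex_dev_t)(x) & 0xfff00000) ? (ex_dev_t)(x) : ex_cvt32(x))

/* Nonzero when the two numbers name different devices. */
#define ex_devcmp(x, y) (ex_devcvt32(x) != ex_devcvt32(y))

/* Invented hash: seeded from the converted value, so the old and new
 * spellings of one device land in the same bucket (0..255). */
static unsigned ex_devhash(ex_dev_t dev)
{
    return (unsigned)((uint32_t)(ex_devcvt32(dev) * 2654435761u) >> 24);
}
```

With this stub, `ex_devcvt32(0x00300005)` returns its argument without touching `ex_cvt32()`, while `ex_devcvt32(0x0305)` makes one call and yields the same 0x00300005. The stored value is never rewritten; only the comparison and the hash see the converted form.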
: * There's no reason to worry about the size of cdevsw[]. If nothing
: else, we can do something we arguably should have done years ago: put
: the individual cdevsw[] entries in the drivers themselves and just
: have cdevsw[] be an array of pointers. The entry points aren't really
: anyone's business but the driver anyway. (Note that I'm *not*
: suggesting we have the driver `register' it. HTF do you tell the
: driver to do that, and where would it get its major number from,
: anyway?)
For now, I'm going with the hybrid suggestion of thorpej and cgd: a
${MACHINE_ARCH} global table of devices, that is _not_ shared across All
Ports. That maximizes compatibility and minimizes table size.
: * We can't have stat(2) or mknod(2) magically change device numbers
: behind our back. For example, consider using pax(1) to pack or unpack
: a file system used by another operating system. Total lossage.
Well, that will automagically happen if the specinfo struct gets changed,
won't it? I don't want any changing of numbers going on at all... that
point was already made.
But, if you're unpacking device nodes for a different system, and you're
going to use them locally instead of exported, you're going to lose. :>
: Ignoring funky interfaces to mknod(8), you could do the entire change
: to 12/20 device numbers by touching <20 lines of code. Why add hair
: where you don't need it?
Well, ever since `Rev. 3' of my proposal, that's about all it will take.
I've been working on it already, with the current __devcvt32() just a stub
that keeps device major and minor the same. After that code works correctly
under heavy tests, I'll start adding a renumber table to sparc as a first
shot....
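
One way a renumber table could slot in behind the stub, as a rough sketch only. Nothing here comes from the actual code: the table name, its size, and every major number in it are invented for illustration, and old majors outside the table simply fall back to the "keep it the same" stub behaviour.

```c
#include <stdint.h>

#define EX_NOLDMAJ 4    /* hypothetical: old majors covered by the table */

/* Hypothetical per-architecture table: index = old major, value = new major.
 * The values are made up; new major 0 is avoided because it is reserved
 * for old-style numbers. */
static const uint16_t ex_renumber[EX_NOLDMAJ] = { 16, 17, 18, 19 };

/* Convert only the major (the cheaper of the two conversions). */
static uint32_t ex_cvtmaj(uint32_t old)
{
    uint32_t omaj = (old >> 8) & 0xff;

    /* Majors not in the table keep their old number, like the stub. */
    return omaj < EX_NOLDMAJ ? ex_renumber[omaj] : omaj;
}

/* Convert major and minor: table lookup for the major, minor carried over. */
static uint32_t ex_cvt(uint32_t old)
{
    return (ex_cvtmaj(old) << 20) | (old & 0xff);
}
```

For example, with this made-up table the old number 0x0205 (major 2, minor 5) would convert to 0x01200005 (major 0x012, minor 5).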
=====
===== Todd Vierling (Personal tv@pobox.com) =====
== "There's a myth that there is a scarcity of justice to go around, so
== that if we extend justice to 'those people,' it will somehow erode the
== quality of justice everyone else receives." -- Maria Price