Subject: Re: 32 bit dev_t
To: Charles M. Hannum <mycroft@mit.edu>
From: Todd Vierling <tv@NetBSD.ORG>
List: tech-kern
Date: 01/13/1998 09:48:23

On Tue, 13 Jan 1998, Charles M. Hannum wrote:

: * As an alternative to renumbering, you could instead make the minor
: number non-contiguous; i.e.:
: 
:     3322222222221111111111
:     10987654321098765432109876543210
:     |--minor---||--major---||minor-|    new
:                     |major-||minor-|    old
: 
: * Given the above scheme, should we decide to renumber everything at
: some point (hopefully only once!!!), we can simply choose some portion
: of the major numbers (say, the first 256) and use them for
: `compatibility'.  There's no need to waste half the number space.

Well, you don't waste `half' in reserving `new' major number 0--you only
lose 1,048,576 out of 4,294,967,296 possible device numbers.  But there's a
big problem with splitting the numbers this way:  kernel overhead. 
Disassembling and reassembling pieces of ints turns a bitand-and-shiftright
into a (bitand)-bitor-(bitand-and-shiftright).  Not that this is too bad,
but it's more than it is worth, particularly if you need to look at a
hexadecimal dev_t in kernel data dumps and figure out what it is.  Why
bother, if the dev_t's will be renumbered anyway?
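
To make the cost concrete, here is a minimal sketch of the two layouts in
plain C.  The function names are made up for illustration, and the bit
positions are read off the diagram quoted above; this is not code from any
tree.

```c
#include <stdint.h>

/* Contiguous 12/20 layout: major in bits 31..20, minor in bits 19..0. */
static uint32_t contig_major(uint32_t d) { return (d >> 20) & 0xfff; }
static uint32_t contig_minor(uint32_t d) { return d & 0xfffff; }

/*
 * Split layout from the quoted diagram: high 12 bits of the minor in
 * bits 31..20, major in bits 19..8, low 8 bits of the minor in bits 7..0.
 */
static uint32_t split_major(uint32_t d) { return (d >> 8) & 0xfff; }

/* minor(): (bitand-and-shiftright) bitor (bitand) */
static uint32_t split_minor(uint32_t d)
{
	return ((d & 0xfff00000u) >> 12) | (d & 0xff);
}

/* Reassemble a split-layout value from a 12-bit major and 20-bit minor. */
static uint32_t split_makedev(uint32_t maj, uint32_t min)
{
	return ((min & 0xfff00u) << 12) | ((maj & 0xfff) << 8) | (min & 0xff);
}
```

In the split layout an old 16-bit value such as 0x0305 still reads as major
3, minor 5 with no conversion, which is its appeal; the price is the extra
mask and or in minor(), plus a dev_t that no longer reads cleanly in hex.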

(As a side note, I now remember why a 12/20 split Felt Right as opposed to
10/22 and 14/18.  In a 12/20, you can take a dev_t as a hex number and pull
major and minor straight out of it: 3 digits major, 5 digits minor.)
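
A small sketch of that hex reading, assuming the contiguous 12/20 layout
(major in the top 12 bits).  The helper name dev_hex() is hypothetical and
exists only for this illustration.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Format a 12/20 device number as "MMM:mmmmm".  Because 12 and 20 are
 * both multiples of 4, the split falls on a hex digit boundary: the
 * first 3 digits of the 8-digit hex value are the major, the last 5
 * are the minor.  "out" must hold 10 bytes (9 characters plus NUL).
 */
static void dev_hex(uint32_t d, char out[10])
{
	snprintf(out, 10, "%03x:%05x",
	    (unsigned)((d >> 20) & 0xfff), (unsigned)(d & 0xfffff));
}
```

With a 10/22 or 14/18 split the boundary lands in the middle of a hex digit,
so the same trick needs a shift by hand.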

: * If you're going to change all the existing device numbers, rather
: than turning major() and minor() into functions, you should do any
: old->new conversion (if necessary) in checkalias(), before it's put in
: v_specinfo.  This value is *only* used internally to the kernel.

Well, only internally to the kernel, but not necessarily in just one place. 
And it's not the only place that cares about the value of a dev_t.  Problem
is, as I see it, if I were to convert the value before it gets put into an
info structure, then __stat13() would break--it would get the (converted)
value of a dev_t, which is *not* what we want.  We want __stat13() to get an
opaque value, and as long as the dev_t's are never converted in storage
(only in kernel comparisons), things like finding a tty won't break. 

: (Think about what you're proposing doing to your serial port read()
: path, for example.  A couple of calls to major(), one to minor(), etc.
: How many more layers of indirection before you make it slow enough
: that nobody wants to use it any more?)

Let me quote my /sys/sys/types.h:

typedef u_int32_t       dev_t;          /* device number */
[...]
#if     defined(_KERNEL) && defined(COMPAT_13)
extern  const dev_t __devcvt32 __P((dev_t));    /* convert major _and_ minor */
extern  const u_int __devcvt32maj __P((dev_t)); /* convert major (faster) */
#define devcvt32(x)  ((dev_t)(x) & 0xfff00000 ? (dev_t)(x) : __devcvt32(x))
#define major(x)     ((u_int)(x) & 0xfff00000 ? \
                      ((u_int)(x) >> 20) & 0xfff : __devcvt32maj(x))
#define minor(x)     ((u_int)(devcvt32(x) & 0xfffff))
#define makedev(x,y) (((((dev_t)(x) & 0xfff) << 20) | ((dev_t)(y) & 0xfffff)))
#define devcmp(x,y)  (devcvt32(x) != devcvt32(y))
[...]

You'll note that I added inline checks for `new' and `old' devices in
major(), minor(), devcvt32(), and devcmp().  If a device is `new', a
function call is never made.  Better yet, these all drop out if COMPAT_13 is
undefined.
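
As a standalone illustration of that fast path, here is a userland sketch.
Everything in it is an assumption made for the example: the names are
renamed stand-ins (the double-underscore names are reserved outside the
kernel headers), the conversion is an identity stub that keeps major and
minor the same, the old layout is taken to be 8/8 as in the quoted diagram,
and the call counter exists only to show when the out-of-line path is taken.

```c
#include <stdint.h>

typedef uint32_t dev32_t;	/* stand-in for dev_t */

static int slow_calls;		/* counts out-of-line conversions */

/* Stub: old 8/8 value -> 12/20 value with the same major and minor. */
static dev32_t cvt32(dev32_t d)
{
	slow_calls++;
	return (((d >> 8) & 0xff) << 20) | (d & 0xff);
}

/* Stub: major only, from an old 8/8 value. */
static unsigned cvt32maj(dev32_t d)
{
	slow_calls++;
	return (d >> 8) & 0xff;
}

/* A nonzero top 12 bits means `new'; `new' major 0 is reserved. */
#define DEVCVT32(x) \
	(((dev32_t)(x) & 0xfff00000u) ? (dev32_t)(x) : cvt32(x))
#define MAJOR(x) \
	(((dev32_t)(x) & 0xfff00000u) ? \
	    (((dev32_t)(x) >> 20) & 0xfffu) : cvt32maj(x))
#define MINOR(x)	(DEVCVT32(x) & 0xfffffu)
#define DEVCMP(x, y)	(DEVCVT32(x) != DEVCVT32(y))
```

Taking MAJOR() and MINOR() of a `new' value such as 0x00300005 leaves
slow_calls at zero; only an `old' value such as 0x0305 reaches the stubs,
and DEVCMP() treats the two as the same device.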

The devcvt32() interface (without the underscores) exists only as a backend
for generating hash seed values that are the same for 16- and 32-bit device
numbers, and is used nowhere in my code modifications to convert a value
before it gets stored in some structure.

: * There's no reason to worry about the size of cdevsw[].  If nothing
: else, we can do something we arguably should have done years ago: put
: the individual cdevsw[] entries in the drivers themselves and just
: have cdevsw[] be an array of pointers.  The entry points aren't really
: anyone's business but the driver anyway.  (Note that I'm *not*
: suggesting we have the driver `register' it.  HTF do you tell the
: driver to do that, and where would it get its major number from,
: anyway?)

For now, I'm going with the hybrid suggestion of thorpej and cgd: a
${MACHINE_ARCH} global table of devices, that is _not_ shared across All
Ports.  That maximizes compatibility and minimizes table size.

: * We can't have stat(2) or mknod(2) magically change device numbers
: behind our back.  For example, consider using pax(1) to pack or unpack
: a file system used by another operating system.  Total lossage.

Well, that will automagically happen if the specinfo struct gets changed,
won't it?  I don't want any changing of numbers going on at all... that
point was already made.

But, if you're unpacking device nodes for a different system, and you're
going to use them locally instead of exported, you're going to lose.  :>

: Ignoring funky interfaces to mknod(8), you could do the entire change
: to 12/20 device numbers by touching <20 lines of code.  Why add hair
: where you don't need it?

Well, ever since `Rev. 3' of my proposal, that's about all it will take.
I've been working on it already, with the current __devcvt32() just a stub
that keeps device major and minor the same.  After that code works correctly
under heavy tests, I'll start adding a renumber table to sparc as a first
shot....

=====
===== Todd Vierling (Personal tv@pobox.com) =====
== "There's a myth that there is a scarcity of justice to go around, so
== that if we extend justice to 'those people,' it will somehow erode the
== quality of justice everyone else receives."  -- Maria Price