Subject: Re: Cloning bdev/cdev devices, step one
To: Chuck Silvers <chuq@chuq.com>
From: Bill Studenmund <wrstuden@zembu.com>
List: tech-kern
Date: 07/07/2000 13:33:26

On Thu, 6 Jul 2000, Chuck Silvers wrote:

> this sounds like a fine thing to me.  making devices less tied to
> dev_t internally in the kernel is a good thing, since dev_t is really
> just a kludgy way of representing a device in a filesystem.

Uhm, how is it kludgy? Hasn't the dev_t been all that unix has ever
needed to represent/specify a device? I mean, hasn't it always been the
canonical specifier? :-) If dev_t is the canonical specifier, how is
using dev_t in the kernel a kludge? :-)

> I wonder though, is there any value in providing the glue to be able
> to refer to the cookie as a vnode field?  v_rdev looks to be there
> more for backward compatibility than anything else, but this is something

It's been there since the initial import from Berkeley, so we'd need their
source control logs to see the real history. But I think it means
"real" device, as opposed to the dev_t on disk.

99.9% of the time, the two are the same. But if we go to 64 partitions per
disk (using new major numbers), then they won't necessarily be the
same. I expect that what we will do is map the 8/16 partition devices to
the equivalent 64 partition device. So for things like sd0a, the inode
will still have the 8/16 partition major, while the vnode's v_rdev will
have the 64-partition major number. That way the right thing happens if
there's an sd0a on disk with the 64-partition major number.
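
As a rough sketch of that mapping (everything here is made up for
illustration: the toy_ names, the major numbers, and the 16/16-bit dev_t
packing are assumptions, not what is in the tree), the translation from
the on-disk dev_t to the value that would land in v_rdev could look like:

```c
#include <stdint.h>

/* Toy dev_t: major in the high 16 bits, minor in the low 16 bits.
 * This packing is an assumption for the sketch, not the real layout. */
typedef uint32_t toy_dev_t;

#define TOY_MAJOR(d)      ((uint32_t)(d) >> 16)
#define TOY_MINOR(d)      ((uint32_t)(d) & 0xffffu)
#define TOY_MAKEDEV(M, m) \
	((toy_dev_t)(((uint32_t)(M) << 16) | ((uint32_t)(m) & 0xffffu)))

/* Hypothetical major numbers, chosen only for this example. */
#define SD_MAJOR_OLD 4    /* 8 partitions per unit (assumed) */
#define SD_MAJOR_64  100  /* 64 partitions per unit (assumed) */
#define OLD_PARTS    8
#define NEW_PARTS    64

/* Map the dev_t stored in the inode to the "real" dev_t for v_rdev.
 * Anything that is not the old sd major passes through unchanged. */
toy_dev_t
toy_real_dev(toy_dev_t ondisk)
{
	uint32_t unit, part;

	if (TOY_MAJOR(ondisk) != SD_MAJOR_OLD)
		return ondisk;
	unit = TOY_MINOR(ondisk) / OLD_PARTS;   /* which disk */
	part = TOY_MINOR(ondisk) % OLD_PARTS;   /* which partition */
	return TOY_MAKEDEV(SD_MAJOR_64, unit * NEW_PARTS + part);
}
```

With something like this, an sd0a node created under either major ends up
with the same v_rdev, so the two nodes are seen as aliases of one device.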

Also, I suspect that the /dev/console vnode used to have the real console
device's dev_t shoved into its vnode. Nowadays, our console driver just
hangs onto a vnode with the right dev_t in it. Note: this is all done so
that sys_revoke() works right.

> new so there is no previous name to be compatible with.  the same would
> have held for the other v_ aliases defined along with v_rdev, but I guess
> whoever was doing that was trying to be consistent.  my take on that
> is that it's confusing the namespaces of vnodes vs. devices and it would be
> better to not pretend those are vnode fields, but that's pretty subjective.

I've dug into all of this fairly deeply, to get layered device nodes to
work. And it does make sense. :-) Think of them as vnode fields which are
only valid if you have a device node. Anything in the vfs system which
sees a character or block vnode knows that these fields are there. They
are also vnode fields in that they are a public interface to the node (as
opposed to the fs-specific private stuff).

:-)

Note: if we do put the cookie in the vnode, I think it should go in struct
specinfo, and get a v_devcookie define too. Mainly because it helps memory
scaling. We only need these fields for devices. With them in struct
specinfo, we only allocate space for them for each seen device. If we put
it in struct vnode, then we allocate that space always. I haven't done
counts, but I expect most systems to have a LOT more vnodes than device
vnodes. :-)
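
To make the layout argument concrete, here is a stripped-down sketch. The
toy_ structures are simplified stand-ins for struct vnode and struct
specinfo, and si_devcookie is a hypothetical field name for the proposed
cookie; only the v_rdev/v_devcookie alias style follows what was
described above.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t toy_dev_t;

/* Per-device state, allocated only for block/character vnodes. */
struct toy_specinfo {
	toy_dev_t  si_rdev;       /* "real" device number */
	void      *si_devcookie;  /* hypothetical: the proposed cookie */
};

enum toy_vtype { TOY_VREG, TOY_VDIR, TOY_VBLK, TOY_VCHR };

/* Simplified vnode: device fields are reached through v_specinfo. */
struct toy_vnode {
	enum toy_vtype       v_type;
	struct toy_specinfo *v_specinfo;  /* NULL unless VBLK/VCHR */
};

/* Aliases so callers can write vp->v_rdev and vp->v_devcookie. */
#define v_rdev      v_specinfo->si_rdev
#define v_devcookie v_specinfo->si_devcookie

/* The aliases are only valid on device nodes, so check the type first. */
void *
toy_get_cookie(struct toy_vnode *vp)
{
	if (vp->v_type != TOY_VBLK && vp->v_type != TOY_VCHR)
		return NULL;
	return vp->v_devcookie;
}

/* Bytes spent on the cookie if it lives in specinfo: one per device. */
size_t
toy_cookie_cost_specinfo(size_t ndev)
{
	return ndev * sizeof(void *);
}

/* Bytes spent on the cookie if it lives in every vnode. */
size_t
toy_cookie_cost_vnode(size_t nvnodes)
{
	return nvnodes * sizeof(void *);
}
```

The two cost helpers just spell out the scaling: the specinfo variant
grows with the number of device nodes seen, the vnode variant with the
total number of vnodes.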

> perhaps it would be useful to sketch out an example of how this scheme would
> work so that the details would be clearer.  I know that matt thomas was
> advocating a different scheme (and perhaps other people have other ideas),
> and if we had examples of how each of them would accomplish their goals
> we'd have a better notion of their benefits.

I'm awaiting Jason's sketch too. I really like the idea of being able to
add ccd's on the fly (the part shown so far). I'm a bit worried about
swapping out vnodes in upper layers (since we'd have to either special
case certain device major numbers, or we'd have to be passing struct
vnode ** into VOP_IOCTL() so the device could do it), but it might work
well.
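
For what it's worth, the reason the struct vnode ** variant comes up is
plain C: a callee can only replace the caller's vnode if it gets the
address of the caller's pointer. A toy illustration follows; toy_vnode
and toy_ioctl_clone() are hypothetical names, and this is not how
VOP_IOCTL() is declared today nor necessarily how the proposed scheme
would do it.

```c
#include <stddef.h>

/* Toy stand-in for a device vnode. */
struct toy_vnode {
	int v_unit;   /* which device instance this node refers to */
};

/* Hypothetical clone operation: the driver fills in a fresh node and
 * swaps it in by writing through vpp. With a plain struct toy_vnode *
 * argument, the caller's pointer could not be changed from here. */
int
toy_ioctl_clone(struct toy_vnode **vpp, struct toy_vnode *fresh)
{
	if (vpp == NULL || *vpp == NULL || fresh == NULL)
		return -1;
	fresh->v_unit = (*vpp)->v_unit + 1;  /* stand-in for "next free unit" */
	*vpp = fresh;                        /* caller now holds the new node */
	return 0;
}
```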

Take care,

Bill