Subject: The devvp branch
To: None <tech-kern@netbsd.org>
From: Frank van der Linden <fvdl@wasabisystems.com>
List: tech-kern
Date: 10/02/2001 01:42:58

As you might have seen, I've been working on the thorpej-devvp branch
that Jason started a while ago.

On this branch, the dev_t argument to most device entry functions has
been replaced by a struct vnode *, pointing to the vnode associated
with the device. The exceptions are the 'dump' and 'psize' entries
for block devices, which still take a dev_t. Replacing this with
a vnode pointer is hard to do, and not really desirable at this time.

The reason for this change was that just passing a dev_t (a datatype
that someday will hopefully go away altogether, but that's for
another day) wasn't flexible enough for things like device cloning.
Device cloning basically means that you get a private instance or
'clone' of the device once you have opened it. A good example
of a device that can benefit from this is bpf; the number
of bpf devices is currently limited in the kernel config file, which
is unnecessary. I will, in fact, make bpf a cloning device; it is
trivial to do so.

For example, there was no way to create any instance-specific (per-open)
data for a device, a feature that cloning devices need. Per-open data is
also a mechanism that other OSs use often, which makes kernel modules
written for them tough to port (I had to deal with this for vmware and
plex86, and it's also used in the XFree86 DRI kernel modules written
for FreeBSD).

Also, a lot of device code currently does

int
fooread(dev_t dev, ...)
{
	struct foo_softc *sc;
	...
	sc = device_lookup(&foo_cd, minor(dev));
	...
}

which is suboptimal. This can simply become

int
fooread(struct vnode *devvp, ...)
{
	struct foo_softc *sc;
	...
	sc = vdev_privdata(devvp);
	...
}

where vdev_privdata is a macro that is described below (it really
just reads a field of the specinfo structure that hangs off the vnode
structure, so it's basically a plain assignment).
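
To make the difference concrete, here is a minimal userland sketch of
the new-style lookup. The struct layouts and the si_devpriv field name
are assumptions made up for illustration, not the actual definitions on
the branch, and fooread is cut down to a single argument.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct specinfo {
	void *si_devpriv;	/* assumed field name: driver private data */
};

struct vnode {
	struct specinfo *v_specinfo;
};

/* Assumed shape of the accessors: a plain field read and write. */
#define vdev_privdata(vp)	((vp)->v_specinfo->si_devpriv)
#define vdev_setprivdata(vp, p)	((vp)->v_specinfo->si_devpriv = (p))

struct foo_softc {
	int sc_unit;
};

/* New-style entry point: no lookup by minor number, just a pointer load. */
static int
fooread(struct vnode *devvp)
{
	struct foo_softc *sc = vdev_privdata(devvp);

	assert(sc != NULL);
	return sc->sc_unit;
}
```

The driver would store the softc pointer once (for example at open
time) with vdev_setprivdata, and every later entry point picks it up
straight from the vnode.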

Here is a description of the changes (see my commit messages to
the devvp branch for a bit more detail):

	o In struct bdevsw (and all block device drivers), change the
	  following functions to take a struct vnode * instead of a dev_t:

		o d_open
		o d_close
		o d_ioctl

	  Also add a d_flags field, which currently can only hold
	  the DF_CLONING flag to indicate a cloning device.

	o In struct cdevsw (and all character device drivers), change the
	  following functions to take a struct vnode * instead of a dev_t:

		o d_open
		o d_close
		o d_read
		o d_write
		o d_ioctl
		o d_tty
		o d_poll
		o d_mmap

	  Also add a d_flags field, same as for bdevsw.

	o Add iscloning{b,c}dev(dev_t) macros to detect a cloning
	  device given a dev_t.
	o Add an iscloningvnode(struct vnode *) macro to detect a
	  cloning device from a vnode.
	o Change the vnode calls for specfs to pass vnodes instead
	  of dev_t to the device entries mentioned.
	o Give VOP_OPEN an extra argument (struct vnode **) that will
	  contain a new (clone) vnode if the open is applied to
	  a cloning device. May be NULL if the caller isn't
	  interested in obtaining a new vnode.
	o Add an extra VCLONED flag for cloned vnodes to deal with
	  them properly. Modify functions that deal with aliases
	  to be cloned-vnode aware (basically it comes down to
	  them being on the alias hashchain, but getting a different
	  treatment).
	o Add a few extra macros/functions that make dealing with
	  instance data / cloning easier (a vdev(9) manpage
	  will be written):

		void *vdev_privdata(struct vnode *)
			Return private data associated with vnode.
		dev_t vdev_rdev(struct vnode *)
			Return dev_t for the device that vnode is
			associated with.
		void vdev_setprivdata(struct vnode *, void *)
			Associate private (instance) data with
			a vnode.
		int vdev_reassignvp(struct vnode *, dev_t dev)
			Re-associate a vnode with a new device.
			Useful when a cloning device wants to
			allocate a new minor device number for
			an instance, and wants the passed-in
			vnode to be associated with it.
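
As a rough illustration of how a cloning driver might use these pieces,
here is a userland sketch. Everything in it is simplified: the entry
points take only the vnode (the real ones take more arguments), and the
DF_CLONING value, the si_devpriv field, the foo_instance structure and
the iscloning() helper are hypothetical. The vnode cloning done through
VOP_OPEN is left out entirely.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define DF_CLONING	0x01	/* flag value is an assumption */

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct specinfo { void *si_devpriv; };
struct vnode { struct specinfo *v_specinfo; };

static void *
vdev_privdata(struct vnode *vp)
{
	return vp->v_specinfo->si_devpriv;
}

static void
vdev_setprivdata(struct vnode *vp, void *data)
{
	vp->v_specinfo->si_devpriv = data;
}

/* Trimmed-down cdevsw: only the entries this sketch uses. */
struct cdevsw {
	int	(*d_open)(struct vnode *);
	int	(*d_close)(struct vnode *);
	int	d_flags;
};

/* Hypothetical per-open state for a cloning "foo" device. */
struct foo_instance {
	int fi_count;
};

/* Allocate private state for this open and hang it off the vnode. */
static int
fooopen(struct vnode *devvp)
{
	struct foo_instance *fi = calloc(1, sizeof(*fi));

	if (fi == NULL)
		return ENOMEM;
	vdev_setprivdata(devvp, fi);
	return 0;
}

/* Release the per-open state again. */
static int
fooclose(struct vnode *devvp)
{
	free(vdev_privdata(devvp));
	vdev_setprivdata(devvp, NULL);
	return 0;
}

static const struct cdevsw foo_cdevsw = { fooopen, fooclose, DF_CLONING };

/* Rough analogue of the iscloning* macros, given a switch entry. */
#define iscloning(sw)	(((sw)->d_flags & DF_CLONING) != 0)
```

Each open ends up with its own foo_instance hanging off its vnode, which
is the per-open data that a plain dev_t could not carry.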

I have tested kernels from the devvp branch on a few systems,
and have not seen problems for a while.

But I'd like people to test this before it gets merged, which I hope
to do one or two weeks from now. For example, I couldn't test
things like PPP, which is a good test case because it uses tty
devices heavily. In general, if you have anything that talks directly
to devices in a non-standard way, you have a good test case.

I'm uploading some i386 GENERIC* kernels to pub/NetBSD/misc/fvdl/devvp/i386
on ftp.netbsd.org for testing; I can compile sparc and sparc64 too if
needed. If you would like to try one of those kernels out, and tell
me if it works for you (i.e. if you do not have problems with it
that plain -current is *not* giving you), that'd be much appreciated.

- Frank

-- 
Frank van der Linden                           fvdl@wasabisystems.com
======================================================================
Quality NetBSD CDs, Support & Service.   http://www.wasabisystems.com/