Subject: CFR: The Auto-Generation Block/Character Device Switch Tables
To: None <tech-kern@NetBSD.org>
From: MAEKAWA Masahide <gehenna@NetBSD.org>
List: tech-kern
Date: 05/09/2002 18:37:35
Here is a proposal for a framework to support the auto-generation of
block/character device switch tables by config(8).



   Auto-Generation of Block/Character Device Switch Tables by config(8)


1.	Background

It is too painful to maintain the port-dependent conf.c and conf.h, together
with sys/conf.h, by hand.

2.	Current Implementation

Currently the block/character device switch (bdevsw, cdevsw) tables live in
sys/arch/<ARCH>/<ARCH>/conf.c as statically defined arrays that are
maintained by hand. Each entry is filled in with the device interface
functions (i.e. open, close, etc.) using macros. Many such macros are
defined for convenience in sys/conf.h and machine/conf.h.

Whether an entry is active is determined at compile time by the NXXX count
generated by config(8). In addition, many device drivers contain functions
that do nothing but return an EXXX error value defined in errno.h.
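
As a rough illustration of the scheme just described, here is a minimal,
self-contained sketch. It is not the real sys/conf.h: the devices "foo" and
"bar", the one-member switch structure and every mock_* name are
hypothetical simplifications, in the spirit of the cdev_*_init macros.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Stand-ins for the NXXX counts config(8) generates (hypothetical devices). */
#define NFOO	1	/* "foo" is configured into this kernel */
#define NBAR	0	/* "bar" is not */

struct mock_cdevsw {
	int	(*d_open)(int dev);	/* the real switch has many more entries */
};

static int foo_open(int dev) { (void)dev; return 0; }
static int bar_open(int dev) { (void)dev; return 0; }

/* One of the many "just return an error" stubs. */
static int mock_enodev(int dev) { (void)dev; return ENODEV; }

/* Select the driver function when the count is non-zero, else the stub. */
#define mock_dev_init(count, fn)	((count) > 0 ? (fn) : mock_enodev)
#define mock_cdev_init(count, name)	{ mock_dev_init(count, name##_open) }

/* Hand-maintained table: the array index is the major number. */
static const struct mock_cdevsw mock_cdevsw_tab[] = {
	mock_cdev_init(NFOO, foo),	/* 0 */
	mock_cdev_init(NBAR, bar),	/* 1 */
};

/* Call the open entry of the given major through the table. */
static int
mock_cdev_open(int maj, int dev)
{
	assert(maj >= 0 &&
	    maj < (int)(sizeof(mock_cdevsw_tab) / sizeof(mock_cdevsw_tab[0])));
	return (*mock_cdevsw_tab[maj].d_open)(dev);
}
```

With these counts, opening major 0 reaches foo_open(), while major 1 falls
through to the ENODEV stub even though bar_open() is compiled in.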

I have one basic question.

What do you think about having so many per-device macros defined in
sys/conf.h or machine/conf.h? Moreover, using those macros forces us to
define many new functions that are the same as nullop/enodev, and to add
kludgy aliases with #define. Relying on such kludges and macros just to
maintain sys/arch/<ARCH>/<ARCH>/conf.c by hand is a Bad Thing, right?

3.	Summary

3.1.	Ideas

The framework needs to support both static and dynamic assignment of the
device majors. To realize the latter, the initial bdevsw/cdevsw tables are
generated automatically by config(8).

For kernel:
In the current implementation, each bdevsw/cdevsw entry (a set of device
interface functions such as open, close, etc.) is embedded in the device
switch definition (conf.c). Instead, each entry is moved into the source of
the corresponding device driver (backend) as constant data.
Because of this, when you write a new device driver, all you have to do
to the machine-dependent part is to add one line to the 'majors' file of
each port (see below) and config files.

The interface functions are no longer global; they are local to their
source file and should always be called via the device switch. It's a bad
idea to call them directly from outside of the driver. To get the device
switch entry corresponding to a specific device, the devsw_lookup(9)
function is introduced. Similarly, the newly added devsw_lookup_major(9)
can be used to get the major number of a specific device.

For config(8):
To support this feature, a new grammar rule, 'device-major', is added to
'files'. All 'device-major' lines are put in the new machine-dependent file
'majors.<ARCH>' under sys/arch/<ARCH>/conf, which is included from
'files.<ARCH>'. This is the only file that contains the device number
definitions.

To support the dynamic assignment of the device major, devsw_attach(9) and
devsw_detach(9) are added; these can be used to attach/detach the device
switch data dynamically instead of using memcpy(). These functions are
useful for the LKM framework.

These features provide greater flexibility, make the device majors less
painful to maintain, and remove many macros (cdev_*_init, bdev_*_init and
so on) and many functions that just return an error. The result is a simple
port-dependent conf.c and simpler sys/conf.h and machine/conf.h.

IMPORTANT:
	I DON'T merge bdevsw/cdevsw into a single structure.
	I tried to merge them at first, but that has a big and
	serious impact. So at this time, I decided NOT to merge
	them. If necessary, we should discuss this in ANOTHER
	thread. Even if that discussion concludes that they should
	be merged into a single structure, this proposal is not
	affected.

3.2.	Examples

3.2.1.	Kernel

Before:
	(foo.c)
		foo_open(...) {
			...
		}
	(bar.c)
		extern foo_open();

		bar() {
			...
			foo_open(...)
			...
		}
After:
	(foo.c)
		const struct cdevsw foo_cdevsw = {
			foo_open, ...
		};

		foo_open(...) {
			...
		}
	(bar.c)
		bar() {
			const struct cdevsw *cdev;

			...
			cdev = devsw_lookup(dev, DEVCHR);	/* dev is a dev_t */
			(*cdev->d_open)(...)
			...
		}

	If no major number is available,

		extern const struct cdevsw foo_cdevsw;

		bar() {
			...
			(*foo_cdevsw.d_open)(...)
			...
		}

3.2.2.	Userland - config(8)

The 'fd' driver has device interfaces for both block and character devices.
If 'fd' is defined in your kernel configuration file, config(8) generates
the following:

- devsw.c
	extern const struct bdevsw fd_bdevsw;
	extern const struct cdevsw fd_cdevsw;

	const struct bdevsw *bdevsw0[] = {
		...
		&fd_bdevsw,
		...
	};

	const struct cdevsw *cdevsw0[] = {
		...
		&fd_cdevsw,
		...
	};

	const struct bdevsw **bdevsw = bdevsw0;
	const struct cdevsw **cdevsw = cdevsw0;

If not, the corresponding entries are filled with NULL.

Here, fd_cdevsw/fd_bdevsw must be provided by the fd driver. So, we need to
add the definitions of these data to the "fd" driver source. Similarly,
every other device has to provide its own device switches.


4.	Synopsis

4.1.	Kernel

enum devswtype { DEVBLK, DEVCHR };

DEVBLK		- Block device switch
DEVCHR		- Character device switch

4.2.	Userland - config(8)

4.2.1.	Grammar

device-major <name> char <num> [block <num>] [<options>]

name		- The prefix of bdevsw/cdevsw entry (required)
char		- A character major number (required)
block		- A block major number (optional)
options		- Conditions that determine whether the device switch must
		  be attached or not (optional)
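
For illustration only, lines in a 'majors.<ARCH>' file following this
grammar might look like the fragment below. The device names, the major
numbers and the form of the trailing condition are hypothetical and are not
taken from any real port.

```
device-major	foo	char 42			foo
device-major	bar	char 43 block 17	bar
```

Here "foo" would have only a character major, while "bar" would have both
a character and a block major.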

4.2.2.	Structures and Variables

struct devm {
	struct devm	*dm_next;	/* linked list */
	const char	*dm_srcfile;	/* the name of the "majors" file */
	u_short		dm_srcline;	/* the line number */
	const char	*dm_name;	/* [bc]devsw name */
	int		dm_cmajor;	/* character major */
	int		dm_bmajor;	/* block major */
	struct nvlist	*dm_opts;	/* options */
};

struct devm *alldevms; /* list of all device-major */
struct devm **nextdevmaj; /* to construct a linked list */

struct hashtab *cdevmtab; /* character devmaj lookup */
struct hashtab *bdevmtab; /* block devmaj lookup */

int maxcdevm; /* max number of character major */
int maxbdevm; /* max number of block major */

These are used only in config(8) and are NOT EXPORTED ANYWHERE.

4.3.	Functions

4.3.1.	Kernel

	const void *devsw_lookup(dev_t dev, enum devswtype type);
	int devsw_lookup_major(const void *devsw, enum devswtype type);
	dev_t devsw_chr2blk_dev(dev_t chrdev);
	dev_t devsw_blk2chr_dev(dev_t blkdev);

	int devsw_attach(const void *devsw, int *maj, enum devswtype type);
	void devsw_detach(const void *devsw, enum devswtype type);

4.3.2.	Userland - config(8)

	int adddevm(const char *name, int cmaj, int bmaj, struct nvlist *opts);
	int mkdevsw(void);
	int fixdevm(void);


5.	Description

5.1.	New Functionality

5.1.1.	Kernel


const void *devsw_lookup(dev_t dev, enum devswtype type);

Get the device switch associated with the dev_t 'dev' and the device switch
type 'type'. The 'type' determines which device switch table is searched.
Internally, the major number is extracted from 'dev' with the major()
macro. Return the device switch on success. Otherwise, return NULL.
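
The lookup can be pictured with the following user-space sketch. It is not
the kernel code: mock_dev_t, the MOCK_* macros, the one-member switch
structures and the tiny tables are assumptions made so that the example
compiles on its own.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum devswtype { DEVBLK, DEVCHR };	/* as in the synopsis */

/* Simplified stand-ins for dev_t, major() and makedev(). */
typedef unsigned int mock_dev_t;
#define MOCK_MAJOR(d)		((int)(((d) >> 8) & 0xff))
#define MOCK_MAKEDEV(ma, mi)	((mock_dev_t)(((ma) << 8) | (mi)))
#define NELEM(a)		((int)(sizeof(a) / sizeof((a)[0])))

struct bdevsw { int (*d_open)(mock_dev_t); };	/* real ones have more */
struct cdevsw { int (*d_open)(mock_dev_t); };

static int foo_open(mock_dev_t dev) { (void)dev; return 0; }
static const struct cdevsw foo_cdevsw = { foo_open };

/* Tables indexed by major number; NULL marks an unused major. */
static const struct bdevsw *mock_bdevsw[] = { NULL, NULL };
static const struct cdevsw *mock_cdevsw[] = { NULL, &foo_cdevsw, NULL };

static const void *
mock_devsw_lookup(mock_dev_t dev, enum devswtype type)
{
	int maj = MOCK_MAJOR(dev);

	assert(type == DEVBLK || type == DEVCHR);
	if (type == DEVBLK)
		return maj < NELEM(mock_bdevsw) ? mock_bdevsw[maj] : NULL;
	return maj < NELEM(mock_cdevsw) ? mock_cdevsw[maj] : NULL;
}

/* A caller outside the driver goes through the switch, never foo_open(). */
static int
mock_open_via_switch(mock_dev_t dev)
{
	const struct cdevsw *cdev = mock_devsw_lookup(dev, DEVCHR);

	if (cdev == NULL)
		return ENXIO;
	return (*cdev->d_open)(dev);
}
```

In this sketch only character major 1 is populated, so every other major,
and every block lookup, yields NULL.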


int devsw_lookup_major(const void *devsw, enum devswtype type);

Get the device major number associated with the device switch 'devsw' and
the device switch type 'type'. The 'type' determines which device switch
table is searched. Return the major number on success. Otherwise, return a
value that is not a valid major number.
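
The reverse lookup can be sketched as a scan of the table for a matching
pointer. This is only an illustration: the tables, the placeholder switch
structures and the use of -1 as the failure value are assumptions, since
the proposal does not name a specific failure value.

```c
#include <assert.h>
#include <stddef.h>

enum devswtype { DEVBLK, DEVCHR };	/* as in the synopsis */

#define NELEM(a)	((int)(sizeof(a) / sizeof((a)[0])))

/* Placeholder switch structures; the members do not matter here. */
struct bdevsw { int placeholder; };
struct cdevsw { int placeholder; };

static const struct bdevsw fd_bdevsw = { 0 };
static const struct cdevsw fd_cdevsw = { 0 };
static const struct cdevsw tty_cdevsw = { 0 };

/* Hypothetical tables; the index is the major number. */
static const struct bdevsw *mock_bdevsw[] = { NULL, NULL, &fd_bdevsw };
static const struct cdevsw *mock_cdevsw[] = {
	&tty_cdevsw, NULL, NULL, &fd_cdevsw
};

/* Return the major of 'devsw' in the table chosen by 'type', else -1. */
static int
mock_devsw_lookup_major(const void *devsw, enum devswtype type)
{
	int i;

	assert(type == DEVBLK || type == DEVCHR);
	if (devsw == NULL)
		return -1;
	if (type == DEVBLK) {
		for (i = 0; i < NELEM(mock_bdevsw); i++)
			if ((const void *)mock_bdevsw[i] == devsw)
				return i;
	} else {
		for (i = 0; i < NELEM(mock_cdevsw); i++)
			if ((const void *)mock_cdevsw[i] == devsw)
				return i;
	}
	return -1;
}
```

Because the driver is identified by the address of its switch, a caller
never needs a hard-coded major number.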


dev_t devsw_chr2blk_dev(dev_t chrdev);

Convert from character dev_t to block dev_t.
Return the valid dev_t (!= NODEV) on success. Otherwise return NODEV.


dev_t devsw_blk2chr_dev(dev_t blkdev);

Convert from block dev_t to character dev_t.
Return the valid dev_t (!= NODEV) on success. Otherwise return NODEV.
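
One way to picture the two conversions is a table that pairs the block and
character majors of each device, with the minor number carried over
unchanged. The proposal does not say how the conversion is implemented, so
the table, the major numbers and all mock_*/MOCK_* names below are
assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for dev_t, NODEV, major(), minor() and makedev(). */
typedef unsigned int mock_dev_t;
#define MOCK_NODEV		((mock_dev_t)-1)
#define MOCK_MAJOR(d)		((int)(((d) >> 8) & 0xff))
#define MOCK_MINOR(d)		((int)((d) & 0xff))
#define MOCK_MAKEDEV(ma, mi)	((mock_dev_t)(((ma) << 8) | (mi)))

struct mock_devsw_conv {
	int	bmajor;		/* -1 if the device has no block major */
	int	cmajor;
};

static const struct mock_devsw_conv mock_conv[] = {
	{  2, 9 },	/* hypothetical device with both interfaces */
	{ -1, 1 },	/* hypothetical character-only device */
};
#define MOCK_NCONV	(sizeof(mock_conv) / sizeof(mock_conv[0]))

static mock_dev_t
mock_devsw_chr2blk_dev(mock_dev_t chrdev)
{
	size_t i;

	if (chrdev == MOCK_NODEV)
		return MOCK_NODEV;
	for (i = 0; i < MOCK_NCONV; i++)
		if (mock_conv[i].cmajor == MOCK_MAJOR(chrdev) &&
		    mock_conv[i].bmajor != -1)
			return MOCK_MAKEDEV(mock_conv[i].bmajor,
			    MOCK_MINOR(chrdev));
	return MOCK_NODEV;
}

static mock_dev_t
mock_devsw_blk2chr_dev(mock_dev_t blkdev)
{
	size_t i;

	if (blkdev == MOCK_NODEV)
		return MOCK_NODEV;
	for (i = 0; i < MOCK_NCONV; i++)
		if (mock_conv[i].bmajor == MOCK_MAJOR(blkdev))
			return MOCK_MAKEDEV(mock_conv[i].cmajor,
			    MOCK_MINOR(blkdev));
	return MOCK_NODEV;
}
```

With this table, assert(mock_devsw_chr2blk_dev(MOCK_MAKEDEV(9, 3)) ==
MOCK_MAKEDEV(2, 3)) holds, and a character-only device converts to
MOCK_NODEV.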


int devsw_attach(const void *devsw, int *maj, enum devswtype type);

Attach the device switch 'devsw' at the major number '*maj' for the device
switch type 'type'. If '*maj' is -1, a major number is allocated dynamically
and the allocated number is stored in '*maj'. Return 0 on success or
an error value.


void devsw_detach(const void *devsw, enum devswtype type);

Detach a device switch 'devsw' associated with the device switch type 'type'.
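
The attach/detach semantics described above, including dynamic allocation
when '*maj' is -1, can be sketched as follows. The fixed-size table, the
"first free slot" policy and the particular errno values are assumptions
for illustration, not the behaviour of the actual patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum devswtype { DEVBLK, DEVCHR };	/* as in the synopsis */

#define MOCK_NMAJOR	8		/* hypothetical table size */

/* One table per type; NULL marks a free major. */
static const void *mock_devsw[2][MOCK_NMAJOR];

/* Placeholder switches for two hypothetical drivers. */
struct cdevsw { int placeholder; };
static const struct cdevsw foo_cdevsw = { 0 };
static const struct cdevsw bar_cdevsw = { 0 };

static int
mock_devsw_attach(const void *devsw, int *maj, enum devswtype type)
{
	const void **tab;
	int i;

	assert(type == DEVBLK || type == DEVCHR);
	if (devsw == NULL || maj == NULL)
		return EINVAL;
	tab = mock_devsw[type];

	if (*maj == -1) {
		/* Dynamic assignment: take the first free major. */
		for (i = 0; i < MOCK_NMAJOR && tab[i] != NULL; i++)
			continue;
		if (i == MOCK_NMAJOR)
			return ENOSPC;
		*maj = i;
	} else if (*maj < 0 || *maj >= MOCK_NMAJOR) {
		return EINVAL;
	} else if (tab[*maj] != NULL) {
		return EEXIST;		/* static major already in use */
	}
	tab[*maj] = devsw;
	return 0;
}

static void
mock_devsw_detach(const void *devsw, enum devswtype type)
{
	int i;

	assert(type == DEVBLK || type == DEVCHR);
	if (devsw == NULL)
		return;
	for (i = 0; i < MOCK_NMAJOR; i++)
		if (mock_devsw[type][i] == devsw)
			mock_devsw[type][i] = NULL;
}
```

A loadable driver would pass a variable initialised to -1 and read the
assigned major back from it, so no major needs to be hard-coded.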

5.1.2.	Userland - config(8)

These functions are used in config(8) ONLY.

int adddevm(const char *name, int cmaj, int bmaj, struct nvlist *opts);

Add an entry to the 'alldevms' list for the name 'name', the character
major number 'cmaj', the block major number 'bmaj' and the options 'opts'.
The options are used in fixdevm() to determine whether this device switch
must be attached or not.
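
A minimal sketch of how such an entry could be appended through the
'nextdevmaj' tail pointer is shown below. It is not the config(8) source:
duplicate detection uses a linear scan instead of the hash tables, -1 is
assumed to mean "no block major", a non-zero return is assumed to mean
failure, and the majors in the usage note are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct nvlist;			/* opaque here; defined inside config(8) */

struct devm {
	struct devm	*dm_next;	/* linked list */
	const char	*dm_srcfile;	/* the name of the "majors" file */
	unsigned short	dm_srcline;	/* the line number (u_short above) */
	const char	*dm_name;	/* [bc]devsw name */
	int		dm_cmajor;	/* character major */
	int		dm_bmajor;	/* block major, -1 if none (assumed) */
	struct nvlist	*dm_opts;	/* options */
};

static struct devm *alldevms;			/* list of all device-major */
static struct devm **nextdevmaj = &alldevms;	/* tail of the list */
static int maxcdevm, maxbdevm;

static int
mock_adddevm(const char *name, int cmaj, int bmaj, struct nvlist *opts)
{
	struct devm *dm;

	if (name == NULL || cmaj < 0)	/* the char major is required */
		return 1;
	for (dm = alldevms; dm != NULL; dm = dm->dm_next)
		if (strcmp(dm->dm_name, name) == 0)
			return 1;	/* duplicate definition */

	dm = calloc(1, sizeof(*dm));
	if (dm == NULL)
		return 1;
	dm->dm_name = name;
	dm->dm_cmajor = cmaj;
	dm->dm_bmajor = bmaj;
	dm->dm_opts = opts;

	/* Append at the tail so the list keeps the file order. */
	assert(*nextdevmaj == NULL);
	*nextdevmaj = dm;
	nextdevmaj = &dm->dm_next;

	if (cmaj > maxcdevm)
		maxcdevm = cmaj;
	if (bmaj > maxbdevm)
		maxbdevm = bmaj;
	return 0;
}
```

For example, calling mock_adddevm("fd", 9, 2, NULL) followed by
mock_adddevm("tty", 1, -1, NULL) leaves a two-element list in that order,
with maxcdevm at 9 and maxbdevm at 2.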


int mkdevsw(void);

Generate initial bdevsw/cdevsw tables, nbdevsw, ncdevsw, swapdev and mem_no.

nbdevsw	- # of bdevsw (i.e. initial bdevsw table size)
ncdevsw	- # of cdevsw (i.e. initial cdevsw table size)
swapdev	- a fake swap device
mem_no	- a memory device character major number
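
To make the generation step concrete, here is a small sketch that emits
only the character table in the style of the devsw.c example in section
3.2.2. The mock_devm structure, its 'dm_active' flag (standing in for the
result of fixdevm()) and the explicit FILE pointer argument are assumptions
for this example; nbdevsw, ncdevsw, swapdev and mem_no are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct mock_devm {
	const char	*dm_name;	/* [bc]devsw name prefix */
	int		dm_cmajor;	/* character major */
	int		dm_active;	/* non-zero if it must be attached */
};

/* Find the active entry that owns character major 'maj', if any. */
static const struct mock_devm *
mock_devm_for_cmajor(const struct mock_devm *devs, int ndevs, int maj)
{
	int i;

	for (i = 0; i < ndevs; i++)
		if (devs[i].dm_active && devs[i].dm_cmajor == maj)
			return &devs[i];
	return NULL;
}

/* Write the initial cdevsw table for majors 0..maxcdevm to 'fp'. */
static int
mock_mkcdevsw(FILE *fp, const struct mock_devm *devs, int ndevs,
    int maxcdevm)
{
	const struct mock_devm *dm;
	int i, maj;

	assert(fp != NULL);
	for (i = 0; i < ndevs; i++)
		if (devs[i].dm_active)
			fprintf(fp, "extern const struct cdevsw %s_cdevsw;\n",
			    devs[i].dm_name);

	fprintf(fp, "\nconst struct cdevsw *cdevsw0[] = {\n");
	for (maj = 0; maj <= maxcdevm; maj++) {
		dm = mock_devm_for_cmajor(devs, ndevs, maj);
		if (dm != NULL)
			fprintf(fp, "\t&%s_cdevsw,\t/* %d */\n",
			    dm->dm_name, maj);
		else
			fprintf(fp, "\tNULL,\t\t/* %d */\n", maj);
	}
	fprintf(fp, "};\n");
	return ferror(fp) ? 1 : 0;
}
```

Majors without an active device come out as NULL entries, matching the
"filled with NULL" behaviour described for unconfigured devices.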


int fixdevm(void);

Determine which device switches must be attached.


6.	Compatibility

These changes break compatibility with LM_DT_BLOCK/LM_DT_CHAR in the LKM
framework. (This feature is used to attach a device switch at run time.)

In the current implementation, each LKM driver searches directly for the
entry reserved for LKMs and changes the device switch tables by using
memcpy(). The device switches are defined in each kernel module's source.

In this proposal, all device switches must be defined in the original driver
source that is also used for the statically linked kernel. So we don't need
to define a device switch in the kernel module source, and we don't need to
search the device switch tables directly to attach the device switch. Just
use devsw_attach(9).

If we try to load an old LKM driver that has not been changed for the new
LKM interface, we will see a terrible disaster. To protect the kernel from
this disaster, LKM_VERSION in sys/lkm.h is bumped, and the LKM drivers MUST
be rebuilt.

Fortunately, this LKM feature (LM_DT_BLOCK/LM_DT_CHAR) is used only by
ipfilter(4) in the current NetBSD tree. It has already been rewritten,
but not tested yet.

The iwm_fd driver on NetBSD/mac68k also changes the device switch tables.
But this driver changes the tables directly by using memcpy() with
hard-coded device majors, without using this LKM feature. There is no good
solution for this. To attach its device switches to the kernel, it should
use the LM_DT_BLOCK/LM_DT_CHAR feature and be freed from hard-coded device
majors.


7.	Implementation

All features have already been implemented for ALL PORTS!!!
The latest patch kit is available at:

http://gehenna.as.wakwak.ne.jp/dev/devsw-20020508.diff.bz2

This patch kit is based on the syssrc source "2002/04/23 00:00:00 JST".