Subject: devfs again, was Re: Keeping obsolete device numbers? (was: CVS commit: src/sys/conf)
To: Garrett D'Amore <garrett_damore@tadpole.com>
From: Bill Studenmund <wrstuden@netbsd.org>
List: tech-kern
Date: 02/27/2006 17:45:03

On Mon, Feb 27, 2006 at 04:09:27PM -0800, Garrett D'Amore wrote:
> This problem really says that we need a real devfs that doesn't suffer
> this problem.  Admins should not have to worry about major numbers.
>
> In fact, major numbers should be totally dynamically allocated, and
> never ever used from anything in userland.
>
> The fact that we still use statically assigned major numbers is one of
> the areas where we're still using 1970s technology, and it's an inferior
> solution.

Yes, thank you. Please review the archives, we have hashed devfs around a
number of times.

This note has the feel of a lecture. Please don't. I think the list has
seen the issue hashed out strongly in the past, and is quite aware of the
issues. A number of steps are in progress to help us get to a devfs,
though the end-step (the devfs file system) isn't there yet.

> If you think about device nodes, you really have a few device properties
> that are interesting:
>
> 1) device path (this would be Solaris' /devices path), this is a fixed
> quantity that is determined by the hardware, is independent of probe
> order, and generally never changes -- it will include things like SCSI
> IDs or WWNs, PCI bus/device/function information, etc.

The concept here we've batted around is that of locators. And extending
the "locators" idea to include WWNs and iSCSI target IDs and such.

> 2) device driver name
>
> 3) device "instance"  (e.g. is it com0 or com1)
>
> 4) device minor number (e.g. call-out vs. direct attached serial lines)
>
> 5) the "class" of device, e.g. "ethernet" vs "serial port" vs "scsi hba"
>
> The various /dev/ nodes are just friendly names to get to these.  What
> we need from a devfs is:
>
>     1) a way to record for a given device path (#1 above) a persistent
> "instance" number (see /etc/path_to_inst on Solaris)
>
>     2) a way to figure out a reasonable /dev/ path name for a given
> driver/instance/minor
>
>     3) physically exporting the /dev/ path via special character/block
> devices (e.g. a special devfs filesystem)

You forgot the one that seems to trip us up:

    4) Persistent ACL storage. For now, this is owner/group/chmod flags.
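
To make item 4 concrete, here is a minimal sketch of what one persistent
permission record could look like, assuming it is keyed by node name. The
struct devfs_perm type and devfs_perm_lookup() are hypothetical names
invented for illustration; nothing like them exists in the tree.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical persistent record: one per /dev node name. */
struct devfs_perm {
	const char *name;	/* node name, e.g. "com0" */
	uid_t uid;		/* owner */
	gid_t gid;		/* group */
	mode_t mode;		/* chmod flags */
};

/*
 * Return the stored record for "name", or "fallback" when nothing has
 * been recorded for that node yet.
 */
static const struct devfs_perm *
devfs_perm_lookup(const struct devfs_perm *tab, size_t n, const char *name,
    const struct devfs_perm *fallback)
{
	assert(tab != NULL && name != NULL);
	for (size_t i = 0; i < n; i++)
		if (strcmp(tab[i].name, name) == 0)
			return &tab[i];
	return fallback;
}
```

Where the table lives across reboots is exactly the part that trips us up;
the sketch says nothing about that.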

> There are some tricky bits, especially with hotplug devices like
> PCMCIA.  E.g. what happens when you plug a NIC into a slot, and that
> NIC is identical in all respects to the previous occupant of the slot,
> except that it has a different MAC address?

NICs don't have /dev nodes.

The feel I have from the last time this was hashed out was that it'd
depend on how the device got wired down. If it was not wired down, then
it'd be eligible for the same ID. If it had been wired down in a way that
could be differentiated (MAC for NIC, if we ever added /dev nodes for
NICs), then it'd get a different ID. If it were wired down in a
non-differentiatable way (slot foo), then it'd get the old ID.
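
The three cases above can be sketched as a small decision function. This
is only an illustration of the rule as described, not existing code: enum
wire_kind, struct wiring and reuses_old_id() are all made-up names, and
the "identity" string stands in for whatever would differentiate a device
(a MAC, say).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical: how the previous occupant was wired down. */
enum wire_kind {
	WIRE_NONE,	/* not wired down at all */
	WIRE_LOCATION,	/* wired by location only, e.g. "slot foo" */
	WIRE_IDENTITY	/* wired by something differentiable, e.g. a MAC */
};

struct wiring {
	enum wire_kind kind;
	const char *identity;	/* only meaningful for WIRE_IDENTITY */
};

/* True if the new occupant is eligible for the previous occupant's ID. */
static bool
reuses_old_id(const struct wiring *w, const char *new_identity)
{
	assert(w != NULL);
	switch (w->kind) {
	case WIRE_NONE:		/* nothing pinned: same ID is available */
	case WIRE_LOCATION:	/* cannot tell the devices apart: old ID */
		return true;
	case WIRE_IDENTITY:	/* old ID only if the identity matches */
		return w->identity != NULL && new_identity != NULL &&
		    strcmp(w->identity, new_identity) == 0;
	}
	return true;
}
```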

> The other surprising result that could catch people off guard is that
> e.g. in the first PCMCIA slot the device is known as "ne0" but in the
> second slot it becomes "ne1", even if slot 0 is not occupied.

Would depend on the wirings.

> One possible snafu, is chroot()'d filesystems, where you only want to
> expose device nodes for certain "safe" devices.  Maybe achieve this with
> some kind of autofs or unionfs hackery?

This point has been hashed out vigorously. devfs MUST support multiple
instances from the get-go, and each instance must be independently
configurable. Thus chroots work right.
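
As a rough sketch of "independently configurable", assume each mounted
instance carries its own list of node names it is willing to expose.
struct devfs_instance and devfs_node_visible() are hypothetical, and the
mount points and node names in the usage note are only examples.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-mount state for one devfs instance. */
struct devfs_instance {
	const char *mount_point;
	/* NULL-terminated list of node names; NULL means expose all. */
	const char *const *visible;
};

/* Decide whether this instance should show the node called "name". */
static bool
devfs_node_visible(const struct devfs_instance *inst, const char *name)
{
	assert(inst != NULL && name != NULL);
	if (inst->visible == NULL)
		return true;
	for (size_t i = 0; inst->visible[i] != NULL; i++)
		if (strcmp(inst->visible[i], name) == 0)
			return true;
	return false;
}
```

With this shape, the instance on /dev would have no list and show
everything, while one mounted inside a chroot might list only "null" and
"zero", so a disk node such as wd0a never appears there.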

> To make devfs in general happen, as I see it, we need the following:
>
> config_attach_sm_loc() needs to have logic to assign a major number for
> a driver if one is not already assigned.  This needs to go into some
> master hash table in the kernel (or resizeable array), protected by a
> lock.

Major numbers aren't the real pain that devfs fixes, so I suggest waiting
on this part.

Let me rephrase that. Ignore major numbers for now. We have more than
enough of them; they aren't what needs fixing now. A devfs with the exact
same major number assignment system we have now will be quite a good
thing.

The hard part is getting the name right. Once you get that, you'll have
done most of it. Further, assuming the configuration uses names instead of
numbers for device driver identification (which is easy to decide to do
now), this major number change will be trivial later.
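
To illustrate why naming drivers makes the later change cheap, here is a
minimal sketch, assuming the configuration refers to a driver only by its
name and the number is resolved at the last moment. struct drv_major and
major_for_driver() are hypothetical, and the numbers in the usage note are
made up rather than real assignments.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical name -> major table; how it gets filled is irrelevant. */
struct drv_major {
	const char *name;	/* driver name, e.g. "com" */
	int major;
};

/* Resolve a driver name to its current major, or -1 if unknown. */
static int
major_for_driver(const struct drv_major *tab, size_t n, const char *name)
{
	assert(tab != NULL && name != NULL);
	for (size_t i = 0; i < n; i++)
		if (strcmp(tab[i].name, name) == 0)
			return tab[i].major;
	return -1;
}
```

A rule that says "com" resolves correctly whether the table is the static
one we have today or one filled in dynamically at attach time; only the
table changes, never the configuration.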

> each driver needs to properly publish either a device class (from which
> a standard driver/instance -> /dev/ path list can be derived), or export
> a subroutine to provide the mapping dynamically.
>
> Interestingly enough, for standard kinds of classes, a real devfs like
> this could export /dev nodes that are generic, rather than having the
> driver name in them.
>
> E.g. you could have "/dev/eth0" or somesuch instead of "/dev/ne0".

Uhm, we could. We actually can do that now, to an extent, with MAKEDEV
games. However I'm not sure we really want to. I personally prefer the
naming we have.

> To get to logical names though you'd need another persistent table,
> containing the logical name & instance mapping to the driver
> name/instance number.

Yep, see above and archives. The best idea we have come up with so far is
either to have a daemon or have a kernel-readable config file that handles
things.

> Btw, when I say "instance" above, I think I'm referring to the dv_unit
> member of struct device.  I'm just used to the Solaris naming
> convention. :-)
>
> So, is anyone working on this?  Anyone *not* want something like this in
> NetBSD?

I'm not sure if anyone is directly working on this, but the wedges work
was a start, as we would then not have disk partitions in the mess. See
the archives regarding what we do and don't want. In general, we want a
devfs; however, we want to avoid the mistakes we can envision now.

Take care,

Bill
