tech-kern: Re: representation of persistent device status, was Re: devfs, was Re: ptyfs...

Subject: Re: representation of persistent device status, was Re: devfs, was Re: ptyfs...
To: Daniel Carosone <dan@geek.com.au>
From: Bill Studenmund <wrstuden@netbsd.org>
List: tech-kern
Date: 11/19/2004 13:03:03

On Fri, Nov 19, 2004 at 01:25:50PM +1100, Daniel Carosone wrote:
> On the issue of namespaces, in one view of a "full devfs", even having
> major,minor numbers as a namespace at all should go away, or at least
> become a historical distraction.  The device *is* the vnode in the
> devfs.

The problem is that pointers into KVA don't make good long-term labels for
storing permissions. :-)

> There was a suggestion recently in a thread about cloning devices that
> illustrates this, re making specfs change the major, minor numbers in
> the returned vnode.  The problem with that, at least for the ptmx
> case, is that having the cloned nodes in that namespace at all
> potentially exposes various aliasing-type problems where another
> normal dev node is created with the same numbers elsewhere.

Cloning doesn't cause that; we have that aliasing issue now. :-|

Also, once we have devfs up & running & we're happy with it, I expect we
will probably deprecate non-devfs device nodes. There may be a full
release of NetBSD between our first devfs in -current and no more
other-device-nodes... :-)

> It's almost certainly my own lack of understanding, but I've also
> never been entirely clear and confident that various kinds of locking
> done in the vnode vs driver layer for some devices is robust against
> such aliasing.

Given that we have a single-threaded kernel, the locking's fine. You are
right that when we move to a fine-grained approach we will probably want
to do something different. Though since the driver will need device
locking at that point, we will still be ok; the device lock and not the
vnode lock is the key one.
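
A minimal sketch of why the device lock is the one that matters under
aliasing. Everything here is hypothetical: `struct device`, `struct vnode`
and the try-lock helpers are illustrative stand-ins built on C11
`atomic_flag`, not NetBSD's real vnode or driver locking primitives.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device: one lock shared by every node that refers to it. */
struct device {
    atomic_flag busy;
};

/* Hypothetical vnode: has its own lock and points at the device it names. */
struct vnode {
    struct device *dev;
    atomic_flag    vlock;
};

/* Per-vnode lock: two aliased vnodes each have their own, so both succeed. */
static bool vnode_trylock(struct vnode *vp)
{
    assert(vp != NULL);
    return !atomic_flag_test_and_set(&vp->vlock);
}

/* Per-device lock: reached through any alias, it is the same lock. */
static bool device_trylock(struct vnode *vp)
{
    assert(vp != NULL && vp->dev != NULL);
    return !atomic_flag_test_and_set(&vp->dev->busy);
}

static void device_unlock(struct vnode *vp)
{
    assert(vp != NULL && vp->dev != NULL);
    atomic_flag_clear(&vp->dev->busy);
}
```

With two vnodes `a` and `b` pointing at the same `struct device`,
`vnode_trylock(&a)` and `vnode_trylock(&b)` both succeed, so the vnode
locks alone do not serialize access. `device_trylock(&b)` fails while `a`
holds the device, whichever node was used to reach it.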

> If the only way to refer to the device (or other "object") is via the
> open handle, many of these potential confusions are eliminated.  Where
> there's an explicit administrative need to get at them via another
> path, one can be provided explicitly - perhaps as in the case of audio
> vs audioctl nodes, and their different opening behaviour.

The problem I see with that is that we have a number of interfaces that
need to do i/o that don't have an "open handle" (I assume you are referring
to an open file descriptor?). They are all kernel-internal, but they are
significant. Like how a disk file system talks to its backing device.

> As for persistence of permissions and other manual settings, I have
> yet to see a strong example that can't be solved with something like
> an mtree specfile at boot, perhaps combined with a kevent-like
> notification to a daemon when new devices appear (perhaps they appear
> with 000 by default, to avoid races?)

So? :-)

Yes, we could do something like that, and it'd about half-work. But I
think it'd be sub-optimal for a number of reasons.

1) Device access is the kernel's job, yet in this example userland is
responsible for controlling it. Yes, you've closed a race hole (and I don't
see any others yet), but userland still has to do it. And we have to trust
userland to do it. One of the things I loved about the flags system (a la
chflags) is that we don't have to trust root. Yet here we have to trust a root
daemon to set things up right (and most importantly not to permit more
access than desired).

2) We have a multi-step creation process. We create some nodes, then we
look at the list and update them. Say we have a node whose name is the
default node name of a different node. Like I want what we'd call block
dev 4, 19 on i386 to be named "sd0d". Yes, normally we'd call that thing
"sd2d". And I agree this is a bit of an abomination. But we can do it now,
and so I think we should be able to do it with devfs. And so we have to be
careful about how exactly this "mtree" list gets applied, so much so I'm
not sure if mtree is really still helpful.
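
The ordering problem in point 2 can be sketched as follows. This is only
an illustration under stated assumptions: `struct node`, `find_by_name`,
`find_by_dev` and `apply_override` are hypothetical names, and the table is
a toy stand-in for a devfs directory, not any NetBSD data structure. The
device numbers follow the example above (block 4, 19 is "sd2d" by default,
and block 4, 3 is assumed to be the default "sd0d").

```c
#include <assert.h>
#include <string.h>

/* Hypothetical in-memory stand-in for one devfs directory entry. */
struct node {
    char name[16];
    int  major;
    int  minor;
};

/* Return the index of the node called `name`, or -1 if there is none. */
static int find_by_name(const struct node *tab, int n, const char *name)
{
    for (int i = 0; i < n; i++)
        if (strcmp(tab[i].name, name) == 0)
            return i;
    return -1;
}

/* Return the index of the node for (major, minor), or -1 if absent. */
static int find_by_dev(const struct node *tab, int n, int major, int minor)
{
    for (int i = 0; i < n; i++)
        if (tab[i].major == major && tab[i].minor == minor)
            return i;
    return -1;
}

/*
 * Apply one override: "device (major, minor) should be called newname".
 * Returns 0 on success, -1 if the device has no node yet, and -2 if the
 * name is currently held by a different device.
 */
static int apply_override(struct node *tab, int n, int major, int minor,
                          const char *newname)
{
    assert(strlen(newname) < sizeof(tab[0].name));

    int i = find_by_dev(tab, n, major, minor);
    if (i < 0)
        return -1;

    int j = find_by_name(tab, n, newname);
    if (j >= 0 && j != i)
        return -2;              /* name collision with another device */

    strcpy(tab[i].name, newname);
    return 0;
}
```

Starting from the default entries "sd0d" (4, 3) and "sd2d" (4, 19), asking
for (4, 19) to become "sd0d" fails with -2 until the default "sd0d" has
been renamed or removed first. So the result depends on the order in which
the list is applied.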

I don't really see what the advantage of a userland daemon is. With it
around, we end up with a table of data in the kernel, in the daemon, and
in the file on disk. I don't really see what the daemon buys us. A kernel
thread that has direct access to the kernel table will do whatever we want
the daemon to do.

> When there's a master process causing a dev node to appear, as in the
> pts case or as with pipes and unix sockets, that process should be
> able to specify the perms.
>
> I did, long ago, suggest the unionfs-mounted-under-/dev trick, such as
> for holding convenience symlinks and similar things.  I can't help the
> feeling that this discussion is making too much of this issue. The
> solution needs to be right, but part of being right is being simple.

I think that is an independent question. A unionfs mount for files &
symlinks may be a good thing too, but it's independent of how we manage
the device node info.

Take care,

Bill
