tech-kern archive


Re: (Semi-random) thoughts on device tree structure and devfs

On Thu, Mar 11, 2010 at 10:52:53PM -0500, der Mouse wrote:
 > > (1) dev_t cannot go away, because a fairly fundamental guarantee in
 > > Unix is that two files are the same if stat returns the same (st_dev,
 > > st_ino) pair for each.
 > This dev_t does not have to correspond, though, to anything else in the
 > system.

Not really, no, but it may as well be the same as what's in st_rdev.
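
For concreteness, the guarantee quoted above is what lets a program
decide whether two paths name the same file. A minimal sketch in C,
where same_file() is a name made up for illustration and not an
existing interface:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: returns 1 if both paths name the same file,
 * 0 if they name different files, and -1 if either stat() fails.
 */
static int
same_file(const char *a, const char *b)
{
	struct stat sa, sb;

	assert(a != NULL && b != NULL);
	if (stat(a, &sa) == -1 || stat(b, &sb) == -1)
		return -1;
	/* Same file iff the (st_dev, st_ino) pairs match. */
	return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

Note that this only looks at st_dev and st_ino; nothing in it cares
whether st_dev corresponds to anything else in the system.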

 > > (3) It is also necessary that device nodes continue to appear as
 > > device nodes to stat (S_IFBLK, S_IFCHR, etc.)
 > No, actually.  See below.
 > > because assorted regrettable things happen if e.g. disk partitions
 > > appear to be regular files.
 > Oh, they probably shouldn't appear to be ordinary files.  (I'm not
 > convinced they can't be; those "regrettable things" could be looked
 > upon as things needing fixing upon switching paradigms.)  

In the best case it's like when naive Linux users first encounter
/proc/kcore. The biggest obvious real problem is that you'll probably
end up with an extra copy of each disk on your backup tapes. Programs
that know to avoid device nodes will also start tripping on the
special semantic properties some devices have, like blocking for
carrier when opening ttys or rewinding tapes.

These issues could probably be fixed with attributes of some kind, but
"I'm a device" is after all exactly the right attribute...

Anyhow, I tried it and the other guys on the project made me revert :-)

 > procfs and kernfs are examples of filesystems which illustrate that
 > it's possible to have non-"device" entities in the filesystem which,
 > when opened, connect to specialized code.

Oh sure, and sometime I should write up VINO's kernfs too (it was not
a failure) but these work out somewhat differently in practice. The
files in procfs and kernfs are for the most part semantically
equivalent to real files even when they're virtual or dynamically
generated. Devices frequently have other properties.

 > Doing this with a devfs might even involve creating a new type of
 > filesystem entity (S_IFDEV, say), though that's quite possibly not
 > necessary.

I don't see any point at all in renaming S_IFBLK/S_IFCHR. Having two
types of devices is not necessarily useful, but it's not so harmful
that it's worth changing around, and any new device type would have
pretty much the same semantics anyway.

 > > [vino] did point out at least two important points in addition to the
 > > ones above.
 > > (1) Attaching a device into devfs and attaching a fs into the fs
 > > namespace are fundamentally the same operation.
 > Only at a very general level, the level of "new stuff appearing in the
 > filesystem", but at that level open(,O_CREAT,) also qualifies.  So do
 > other calls; perhaps most relevantly here, consider mknod() - some of
 > the ideas mentioned upthread have involved a userland daemon that
 > actually does use mknod() to create new device nodes.

Those are different in a fairly basic way: they create an object
within an existing filesystem namespace, as opposed to binding a
foreign object into the namespace.

A traditional device node is also a binding of a foreign object, but
it does it by creating a proxy object in an existing filesystem. There
is nothing inherently wrong with this, and AIUI "translators" are
a similar kind of thing. But the reason people have been floating
devfs schemes for the past 15 years or more is that it has various
unappealing properties, like being static and creating maintenance
problems.

Devfs schemes that don't abolish the proxy tend to get in trouble
because it's too many layers of indirection. (This is not the only
problem, but it's *a* problem.) Devfs schemes that do abolish the
proxy eventually discover that the fs part doesn't actually do
anything besides reimplement mount poorly.

This leads to a non-devfs architecture where device nodes are mounted
in /dev. The remaining trouble arises because they have to be
automounted, and this creates a nontrivial configuration management
problem. As I pointed out somewhere the other day (maybe in chat),
automounter config is a previously unsolved problem.

I think that approach is ultimately workable without major problems,
unlike ~all devfs schemes, but getting it right remains a research
problem.

It also seems desirable given such an architecture, and assuming an
adequate config system, to extend it to automount filesystems as well
as devices; this is probably the only way to make hotpluggable storage
volumes work robustly, but it's a can of worms and it involves
abandoning the traditional /etc/fstab.

 > > (2) Trying to support both dynamically loadable drivers and
 > > automatically named device nodes causes chicken-and-egg problems.
 > > (If a driver isn't loaded, it has no name entry, and therefore you
 > > can't cause it to be loaded by touching the name entry...)
 > That actually does not follow.  Attempting to look up the name (as
 > opposed to doing something with an existing name) could be what
 > triggers the load.

But what do you load? You need some kind of mapping - maybe it's good
enough to assume that accessing "/dev/bletch0" means "try to load a
driver from a module called bletch", but that runs into various
potential problems. (For example: how do you arrange things so
unprivileged users can e.g. demand-load the scanner while preventing
them from DoS'ing by repeatedly running slow or broken probe/attach
code associated with some random ISA driver?) But having an explicit
mapping table goes back to not being able to load something you don't
already know about.
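
To make the implicit mapping concrete, here is a sketch of the "strip
the unit number and call that the module name" rule. The function
name and the rule itself are assumptions for illustration only; it
deliberately does nothing about the permission and DoS questions
above:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical rule: module name = last path component with any
 * trailing unit digits removed, so "/dev/bletch0" maps to "bletch".
 * Returns 0 on success, -1 if nothing is left after stripping or
 * the result does not fit in buf.
 */
static int
module_for_devname(const char *path, char *buf, size_t buflen)
{
	const char *base;
	size_t len;

	assert(path != NULL && buf != NULL);
	base = strrchr(path, '/');
	base = (base != NULL) ? base + 1 : path;
	len = strlen(base);
	while (len > 0 && isdigit((unsigned char)base[len - 1]))
		len--;
	if (len == 0 || len >= buflen)
		return -1;
	memcpy(buf, base, len);
	buf[len] = '\0';
	return 0;
}
```

Even this much shows one of the potential problems: a name that ends
in a partition letter, such as wd0a, comes back unchanged, so the
naive rule would go looking for a module called "wd0a".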

The best solution I've heard of for this is to split all drivers into
separate probe/attach and operation parts; then you can load the probe
modules independently and create skeleton attachments into which the
"real" driver can be demand-loaded only when it's used. This would
probably be workable but it's a very intrusive redesign.
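
Very roughly, the split might look like the following. This is a
userland sketch with made-up names (struct skeleton, load_ops, and
so on), not any real kernel interface; the load_ops callback stands
in for whatever actually loads the operation module:

```c
#include <assert.h>
#include <stddef.h>

/* The "operation" half of a driver: only needed once it is used. */
struct dev_ops {
	int (*open)(int unit);
};

/* What the probe/attach half leaves behind: a skeleton attachment. */
struct skeleton {
	const char *name;
	int unit;
	const struct dev_ops *ops;                /* NULL until loaded */
	const struct dev_ops *(*load_ops)(void);  /* stand-in loader */
	int loads;                                /* successful loads */
};

/* Demand-load the operation half on first use, then dispatch. */
static int
skeleton_open(struct skeleton *sk)
{
	assert(sk != NULL && sk->load_ops != NULL);
	if (sk->ops == NULL) {
		sk->ops = sk->load_ops();
		if (sk->ops == NULL)
			return -1;
		sk->loads++;
	}
	return sk->ops->open(sk->unit);
}

/* A toy "real" driver for the hypothetical bletch device. */
static int
bletch_open(int unit)
{
	return unit;	/* pretend success: just echo the unit number */
}

static const struct dev_ops bletch_ops = { bletch_open };

static const struct dev_ops *
bletch_load(void)
{
	return &bletch_ops;
}
```

The point is that the name (the skeleton) exists as soon as probe and
attach have run, so there is something to touch, while the bulk of
the driver is loaded only on the first skeleton_open() and reused
after that.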

I don't claim to have the answer for this one; we looked at it in VINO
but never got to the point of concluding more than that this problem
exists and it's fairly hard to deal with. Then later on I concluded
that both devfs and modules/demand-loaded drivers are bad ideas and I
stopped caring. :-)

 > Of course, that means that the name exists in some sense, but that
 > sense does not have to be one that's visible to userland (while you may
 > want an administrative interface that lets you see them, it is in no
 > way essential).

If the name exists, it's not clear that there's anything (else) either
hard or wrong about making it visible to userland...

David A. Holland
