Subject: Re: devfs, was Re: ptyfs fully working now...
To: Christos Zoulas <christos@zoulas.com>
From: Bill Studenmund <wrstuden@netbsd.org>
List: tech-kern
Date: 11/14/2004 13:13:10

On Sat, Nov 13, 2004 at 03:30:43AM -0500, Christos Zoulas wrote:
> On Nov 13, 12:10am, wrstuden@netbsd.org (Bill Studenmund) wrote:
> -- Subject: Re: devfs, was Re: ptyfs fully working now...
>
> | On Fri, Nov 12, 2004 at 09:08:24PM -0500, Christos Zoulas wrote:
> | I like the idea of a file that contains info about modes and owners, and I
> | hadn't thought about whiteouts - good idea. However I think a better way
> | to do this is a binary database. I think the keys should be locators;
> | where the device is in the config hierarchy. For each entry, we keep most
> | of the info you have below - name, uid, gid, mode (or ACL), mtime, atime,
> | ctime (I don't think birthtime matters as it won't show up in stat), and
> | type.
>
> Fine, I agree that the device mapping should be using locators. I think
> that birthtime should go in; it would have been nice for stat to be able
> to access it, but that is not the case yet...

Ok, birthtime is in then.

> I just wanted to stash dev_t somewhere to return it to userland for
> stat. It is not meant to be used for anything else.

Oh, ok. It's good for that. And it's probably needed too.
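
To make the per-entry record concrete, here is a minimal sketch of what
one database entry might hold. The class name, field names, and the
string form of the locator are all assumptions for illustration; nothing
here is an agreed-upon format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DevfsEntry:
    """One hypothetical devfs database record, keyed by locator."""
    locator: str                # where the device sits in the config hierarchy
    name: str                   # node name as it appears under /dev
    uid: int
    gid: int
    mode: int                   # permission bits (an ACL could replace this)
    node_type: str              # "chr" or "blk"
    atime: float = 0.0
    mtime: float = 0.0
    ctime: float = 0.0
    birthtime: float = 0.0      # stored, even though stat cannot report it yet
    rdev: Optional[int] = None  # dev_t, kept only so stat can return it


def stat_view(entry):
    """Return the subset of fields a stat-like call would report."""
    return {
        "st_uid": entry.uid,
        "st_gid": entry.gid,
        "st_mode": entry.mode,
        "st_atime": entry.atime,
        "st_mtime": entry.mtime,
        "st_ctime": entry.ctime,
        "st_rdev": entry.rdev,
    }
```

The dev_t field is deliberately read-only data in this sketch: it is
copied out for stat and not used to look anything up.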

> | What I was thinking was that as we boot, devices register their nodes
> | during configuration. Drivers add default info (like owner, mode, and most
> | importantly default name & locator) while registering. Then, like you
> | said, we read a file on boot. However my thought is that we merge the two
> | databases, based on locator. That way devices that are here now and were
> | here before have the exact settings as last time. Nodes that were here
> | last boot but aren't now show up with a NULL device pointer. Nodes that
> | are new show up with default settings.
>
> It does not have to be at boot, but at mount time. Unless we want to mount
> devfs after kernel autoconfiguration which I think is a bit radical. I prefer
> to have it mounted by userland. Then people who don't like devfs don't need
> to use it, and regular devices can still be used in the transition period.

When we transition to wedges, I think we REALLY REALLY need devfs. I do
not think that wedge minor numbers will be stable (or it will be harder to
keep them stable) across boot, and so if we jump to devfs at the same time
(or before), life will be easier.

Also, for the same reason that init needs /dev/console, I think we need
devfs to be mounted before init starts.

What benefit is there for not using devfs once we have transitioned to it
(both as a project and at a given installation)?
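
The merge-by-locator idea quoted above can be sketched roughly as
follows. This is only an illustration: the function name and the use of
plain dictionaries are assumptions, and `None` stands in for the NULL
device pointer.

```python
def merge_by_locator(saved, registered):
    """Merge last boot's saved settings with this boot's registrations.

    saved:      locator -> settings dict read from the saved database
    registered: locator -> (driver default settings dict, device object)
    Returns     locator -> {"settings": dict, "device": object or None}
    """
    merged = {}

    # Devices present now: reuse saved settings if we have them,
    # otherwise fall back to the driver-supplied defaults.
    for locator, (defaults, device) in registered.items():
        settings = saved.get(locator, defaults)
        merged[locator] = {"settings": dict(settings), "device": device}

    # Nodes known from last boot but absent now: keep their settings,
    # with no device attached.
    for locator, settings in saved.items():
        if locator not in registered:
            merged[locator] = {"settings": dict(settings), "device": None}

    return merged
```

Keying on the locator rather than on the name or dev_t is what lets a
node keep its permissions even if its minor number changes across boots.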

> | While I talk a fair bit about wedges above, these thoughts apply to all
> | device nodes. It's just that wedges and disks are the things that move
> | around a lot yet we really really want permissions to not change. Things
> | like serial ports don't move around much.
>
> I agree that wedges and disks need special consideration. I just have not
> sat down and analyzed the requirements for binding partitions to device nodes yet.

Please chat with Jason. He said he has a devfs project too, to go with
wedges.

Part of the idea is to make partitions be free-standing entities. If I
understand GPT partitioning, each partition includes a UUID which can be
used to uniquely resolve them; it won't matter what disk they are on, we
can always uniquely identify them (modulo making a complete disk image &
playing with that while the disk is mounted).

Apple partition maps have names, and would probably need to be tied to a
node. But especially with some sort of WWN locator, they could be quite
stable.

MBR and disklabel partitioning schemes will remain tied to a given device
node. Oh well...



Oh, this part doesn't fit so well in the above, but I have been thinking
some about what "mounting" a devfs would mean, and what we could do with
it. The biggest thing I see we would want to deal with is the case where
the saved database is corrupt. Say there was a crash during a file write.

My thought is that when devfs mounts (even when the kernel or init
automounts it), the mountfrom name is taken as the base name for two file
names, and both are opened by the kernel code. devfs then looks at the
newer of the two files. If it passes db consistency checks (looks like a
good file, no bogus lengths, etc.), then it is the one that gets read into
the running devfs. If it fails (either botched up, or doesn't exist), we
look at the other, old file. If it passes, we use it (chances are there
was an update issue). If it too doesn't work out, we just run with the
node defaults.
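
As a rough userland sketch of that selection logic: the `.0`/`.1`
suffixes, the `DVFS` magic, and the length-prefixed layout below are all
made up for illustration, and the consistency check is a stand-in for
whatever the real db format would need.

```python
import os
import struct

MAGIC = b"DVFS"  # hypothetical file magic


def load_if_consistent(path):
    """Return the payload if the file passes basic checks, else None."""
    try:
        with open(path, "rb") as f:
            data = f.read()
    except OSError:
        return None                      # missing or unreadable
    if len(data) < 8 or data[:4] != MAGIC:
        return None                      # does not look like a db file
    (length,) = struct.unpack(">I", data[4:8])
    if length != len(data) - 8:
        return None                      # bogus length, e.g. a torn write
    return data[8:]


def mtime_or_missing(path):
    """Modification time, or -inf so a missing file sorts as oldest."""
    try:
        return os.stat(path).st_mtime
    except OSError:
        return float("-inf")


def choose_database(base):
    """Try the newer file first, then the older; (None, None) means defaults."""
    candidates = [base + ".0", base + ".1"]
    candidates.sort(key=mtime_or_missing, reverse=True)
    for path in candidates:
        payload = load_if_consistent(path)
        if payload is not None:
            return path, payload
    return None, None
```

A `(None, None)` result corresponds to the last case above: neither file
is usable, so devfs runs with the node defaults.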

Then when we need to save out the database, we overwrite the older of the
two files. Thus we will (should) always have a good version of the db
around.
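
A minimal sketch of the save side, again with hypothetical `.0`/`.1`
suffixes on the base name. A file that does not exist yet is treated as
the oldest, so the first two saves populate both slots.

```python
import os


def mtime_or_missing(path):
    """Modification time, or -inf so a missing file counts as oldest."""
    try:
        return os.stat(path).st_mtime
    except OSError:
        return float("-inf")


def save_database(base, blob):
    """Overwrite the older of the two files and return the path written."""
    candidates = [base + ".0", base + ".1"]
    target = min(candidates, key=mtime_or_missing)
    with open(target, "wb") as f:
        f.write(blob)
        f.flush()
        os.fsync(f.fileno())  # push the data to disk before returning
    return target
```

If a crash interrupts this write, only the older copy is damaged and the
newer one is still intact for the next mount.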


Another thing I think we'd need is a tool that you can point at a
directory (an existing /dev) and it'll generate a db file.
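
Something along these lines, as a sketch only: it collects the
per-node info from the top level of a directory and leaves the output
format open. The function name and record layout are assumptions, and a
plain directory scan cannot recover locators, so those would have to be
filled in some other way.

```python
import os
import stat


def scan_dev_directory(root):
    """Collect settings for each device node directly under root."""
    entries = []
    for name in sorted(os.listdir(root)):
        st = os.lstat(os.path.join(root, name))
        if stat.S_ISCHR(st.st_mode):
            node_type = "chr"
        elif stat.S_ISBLK(st.st_mode):
            node_type = "blk"
        else:
            continue  # skip regular files, symlinks, and subdirectories
        entries.append({
            "name": name,
            "uid": st.st_uid,
            "gid": st.st_gid,
            "mode": stat.S_IMODE(st.st_mode),  # permission bits only
            "type": node_type,
            "rdev": st.st_rdev,                # the node's dev_t
            "atime": st.st_atime,
            "mtime": st.st_mtime,
            "ctime": st.st_ctime,
        })
    return entries
```

Pointing it at an existing /dev would give one record per device node,
ready to be written out in whatever db format is chosen.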

Take care,

Bill
