Subject: Re: improving kqueue
To: Matthew Mondor <>
From: Bill Studenmund <>
List: tech-kern
Date: 02/23/2007 13:47:29

On Fri, Feb 23, 2007 at 04:20:21PM -0500, Matthew Mondor wrote:
> Continuing quite an old thread now that I had a little time to install
> -current and check the reference documentation (svr4) about adding the
> possibility to be notified about vnode modifications events for a whole
> tree rather than a single vnode via kevent:
> If I understand, Bill's reference to svr4 book (the design of the unix
> operating system from Bach), was to point out that it is possible to
> lazily mark vnodes as "is being watched" at lookup time if it is under a
> watched parent vnode.  This would seem to easily allow to return a knote
> with the event type/file handle information back to userland, say for a
> new EVFILT_VNODE_TREE or such filter type.
> This being all fine, there are other problems, however:
> - The functionality would need to be restricted to the superuser,
> considering that file handles bypass permissions (mentioned in
> fhopen(2) man page)

Depends on what we pass back. If we pass back a file handle, then yes, we
need to restrict to superuser.

> - A file handle needs to be resolved back to a pathname for userland.
> There doesn't seem to be a syscall to perform this, nor a reverse cache
> in the kernel currently (probably because of the kernel memory wastage
> this would imply)

However, if we pass back a path, then we don't need to restrict the return.
We just require an access check on the to-be-returned path before return.
If you can see it, then we can return it.

I think we will need to figure out how to handle reverse paths. There are
more and more things going on (well, not yet in NetBSD :-( ) where reverse
paths will be really useful/needed.

> - Moreover, a vnode may resolve to multiple path names, because of
> possible hard links

Or 0, as der Mouse noted. Oh well. For this type of thing, if you can't
get a path, chances are the event won't do you much good.

> Does this imply that the userland daemon would need to perform the following:
> - Setup a kqueue filter on the wanted parent vnode (requested directory)
> - Recursively scan the parent directory using getfh(2) and build a hash
>   table or btree allowing duplicate targets (with unique fh keys)
>   (as well as using fhstat(2) and cache permissions, optionally)
> - Verify for any meanwhile queued kevents and adapt cache (changes may
>   have occurred during the recursive caching process).

You can key all of it off of inode #. While it is questionable to build a
file handle from an inode #, if you have an fh and get the inode, chances
are that all file systems will always return the same inode # for the same
file for its life. So you need only index on a 64-bit # as opposed to
arb-len file handles. :-)

> If such a daemon was to accept client connections via a library, it
> would need to use ancillary data to determine the actual user
> credentials, and to ensure to only report events (converted to paths)
> for files which can normally be accessed by that user (either using
> access(2) (which would mean that for each client the daemon would
> probably need to spawn a new process and revoke its privileges to that
> of the user) or perform a userland brewed equivalent check using
> pre-cached permissions).

Yes, if we do path construction in userland.
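
For the pre-cached variant, the check might look roughly like the sketch
below. It is a simplification under stated assumptions: struct cached_perm
and may_read() are names invented for this example, only the classic
owner/group/other mode bits of the final file are consulted, and search
permission on the directories leading to it, ACLs and file flags are all
ignored. A real daemon would have to deal with those too.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical cache record, filled in from stat(2)/fhstat(2) results. */
struct cached_perm {
    mode_t mode;    /* st_mode */
    uid_t  uid;     /* st_uid  */
    gid_t  gid;     /* st_gid  */
};

/* Is group `g' among the client's groups? */
static bool
in_groups(gid_t g, const gid_t *groups, size_t ngroups)
{
    size_t i;

    for (i = 0; i < ngroups; i++)
        if (groups[i] == g)
            return true;
    return false;
}

/*
 * Would a client with the given uid and group list be allowed to read
 * the file, going by the cached mode bits alone?  Follows the usual
 * UNIX rule that exactly one class applies: owner first, then group,
 * then other.
 */
bool
may_read(const struct cached_perm *p, uid_t uid,
    const gid_t *groups, size_t ngroups)
{
    if (uid == 0)                   /* superuser bypasses the mode bits */
        return true;
    if (uid == p->uid)
        return (p->mode & S_IRUSR) != 0;
    if (in_groups(p->gid, groups, ngroups))
        return (p->mode & S_IRGRP) != 0;
    return (p->mode & S_IROTH) != 0;
}
```

The client's uid and group list would come from the credentials passed
over the connection. Since the cached bits can go stale, the cache has to
be refreshed when an attribute-change event arrives for the file.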

> I'm unsure yet of how large the cache would be for general use (although
> it would be easy to write a small program to test this on various paths,
> to determine cost of building the cache, performing lookups and
> determining the size of that cache for various paths).
> If the system was to only be used by modules running as the superuser,
> access checks would probably not be necessary and dlopen could be used
> to load them;  A function of the module could be called to propagate
> events to them or such.
> I would be interested to know more about the particular project(s) this
> feature was requested for.  For a fam daemon replacement, the
> permissions checking would seem unavoidable, IMO.
> It would also be great to hear others' ideas about how this could be
> done more effectively...

Take care,

