Subject: Re: improving kqueue
To: None <tech-kern@netbsd.org>
From: Matthew Mondor <mm_lists@pulsar-zone.net>
List: tech-kern
Date: 02/23/2007 16:20:21
Continuing quite an old thread, now that I have had a little time to
install -current and check the reference documentation (svr4), about
adding the possibility of being notified of vnode modification events
for a whole tree rather than a single vnode via kevent:

If I understand correctly, Bill's reference to the svr4 book (The Design
of the UNIX Operating System, by Bach) was to point out that it is
possible to lazily mark a vnode as "is being watched" at lookup time if
it is under a watched parent vnode.  This would seem to make it easy to
return a knote with the event type/file handle information back to
userland, say for a new EVFILT_VNODE_TREE or similar filter type.
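
To make the lazy marking idea concrete, here is a minimal userland toy
model of it.  This is only a sketch under stated assumptions: struct
toy_vnode and the toy_*() functions are hypothetical names invented for
illustration, not the NetBSD struct vnode or any existing kernel
interface.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a kernel vnode (NOT the real struct vnode). */
struct toy_vnode {
	struct toy_vnode *parent;	/* directory it was last looked up in */
	bool watched;			/* the "is being watched" mark */
};

/* A tree watch is registered explicitly on one directory. */
static void
toy_watch(struct toy_vnode *dir)
{
	dir->watched = true;
}

/*
 * Called whenever 'child' is looked up in 'dir'.  The mark is inherited
 * lazily here, so nothing below the watched directory has to be visited
 * when the watch is registered.
 */
static void
toy_lookup(struct toy_vnode *dir, struct toy_vnode *child)
{
	child->parent = dir;
	if (dir->watched)
		child->watched = true;
}

/* On modification: would an event be queued for the tree watcher? */
static bool
toy_modify_notifies(const struct toy_vnode *vp)
{
	return vp->watched;
}
```

In this model, a file looked up through a watched directory reports
events, while one looked up elsewhere does not.  The sketch never clears
a mark and does nothing for objects that were looked up before the watch
was registered; a real implementation would have to deal with both.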

This is all fine, but there are other problems:

- The functionality would need to be restricted to the superuser,
considering that file handles bypass permissions (mentioned in the
fhopen(2) man page)

- A file handle needs to be resolved back to a pathname for userland.
There doesn't seem to be a syscall to perform this, nor a reverse cache
in the kernel currently (probably because of the kernel memory wastage
this would imply)

- Moreover, a vnode may resolve to multiple path names, because of
possible hard links


Does this imply that the userland daemon would need to perform the following:

- Set up a kqueue filter on the wanted parent vnode (requested directory)
- Recursively scan the parent directory using getfh(2) and build a hash
  table or btree allowing duplicate targets (with unique fh keys)
  (as well as using fhstat(2) and caching permissions, optionally)
- Check for any kevents queued in the meantime and adapt the cache
  (changes may have occurred during the recursive caching process).
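
The scanning step above could be sketched roughly as follows.  To keep
the example portable, it ASSUMES the (st_dev, st_ino) pair from lstat(2)
as a stand-in for the file handle key that getfh(2) would supply; all
the names (fh_entry, cache_add, cache_find, cache_scan) are hypothetical.
Each key holds a list of paths, so hard links end up as several paths
under a single entry.

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

#define NBUCKETS 1024

struct path_node {			/* one name of an object */
	struct path_node *next;
	char *path;
};

struct fh_entry {			/* one object, possibly many names */
	struct fh_entry *next;
	dev_t dev;			/* ASSUMED key: stands in for the fh */
	ino_t ino;
	mode_t mode;			/* cached permissions */
	uid_t uid;
	gid_t gid;
	struct path_node *paths;
	size_t npaths;
};

static struct fh_entry *buckets[NBUCKETS];

static size_t
bucket_of(dev_t dev, ino_t ino)
{
	unsigned long long h;

	h = ((unsigned long long)dev * 31ULL) ^ (unsigned long long)ino;
	return (size_t)(h % NBUCKETS);
}

static struct fh_entry *
cache_find(dev_t dev, ino_t ino)
{
	struct fh_entry *e;

	for (e = buckets[bucket_of(dev, ino)]; e != NULL; e = e->next)
		if (e->dev == dev && e->ino == ino)
			return e;
	return NULL;
}

/* Record 'path' under its key, creating the entry on first sight. */
static int
cache_add(const char *path, const struct stat *st)
{
	struct fh_entry *e = cache_find(st->st_dev, st->st_ino);
	struct path_node *p;
	size_t b;

	if (e == NULL) {
		if ((e = calloc(1, sizeof(*e))) == NULL)
			return -1;
		e->dev = st->st_dev;
		e->ino = st->st_ino;
		e->mode = st->st_mode;
		e->uid = st->st_uid;
		e->gid = st->st_gid;
		b = bucket_of(e->dev, e->ino);
		e->next = buckets[b];
		buckets[b] = e;
	}
	if ((p = malloc(sizeof(*p))) == NULL)
		return -1;
	if ((p->path = strdup(path)) == NULL) {
		free(p);
		return -1;
	}
	p->next = e->paths;
	e->paths = p;
	e->npaths++;
	return 0;
}

/* Cache everything below 'dir' (not 'dir' itself); symlinks not followed. */
static int
cache_scan(const char *dir)
{
	DIR *d = opendir(dir);
	struct dirent *de;
	struct stat st;
	char path[4096];
	int n, rv = 0;

	if (d == NULL)
		return -1;	/* an unreadable directory aborts this sketch */
	while (rv == 0 && (de = readdir(d)) != NULL) {
		if (strcmp(de->d_name, ".") == 0 ||
		    strcmp(de->d_name, "..") == 0)
			continue;
		n = snprintf(path, sizeof(path), "%s/%s", dir, de->d_name);
		if (n < 0 || (size_t)n >= sizeof(path))
			continue;	/* name too long: skipped */
		if (lstat(path, &st) == -1)
			continue;	/* vanished during the scan */
		rv = cache_add(path, &st);
		if (rv == 0 && S_ISDIR(st.st_mode))
			rv = cache_scan(path);
	}
	closedir(d);
	return rv;
}
```

After cache_scan("/some/dir"), cache_find() on the key of a file with two
hard links inside that tree returns one entry whose path list holds both
names.  The sketch does not handle removal or renames, which is what the
last step of the list would have to cover.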

If such a daemon were to accept client connections via a library, it
would need to use ancillary data to determine the actual user
credentials, and to ensure that it only reports events (converted to
paths) for files which that user can normally access.  This could be
done either with access(2), which would mean that for each client the
daemon would probably need to spawn a new process and drop its
privileges to those of the user, or with a home-brewed userland
equivalent of that check using pre-cached permissions.
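
The second option, a userland check against pre-cached permissions,
might look like the following sketch.  The struct and function names are
hypothetical, and the check is deliberately simplified: it only looks at
the classic owner/group/other mode bits of the final object, so it
ignores search permission on the intermediate path components, ACLs and
file flags, all of which access(2) run as the user would honour.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

#define WANT_READ	4	/* same layout as one rwx triplet */
#define WANT_WRITE	2
#define WANT_EXEC	1

/* Permissions cached for an object at scan time (hypothetical). */
struct cached_perm {
	mode_t mode;
	uid_t uid;
	gid_t gid;
};

/* Credentials received from the client (hypothetical). */
struct client_cred {
	uid_t uid;
	gid_t gid;
	const gid_t *groups;	/* supplementary groups, may be NULL */
	size_t ngroups;
};

static bool
in_group(const struct client_cred *c, gid_t gid)
{
	size_t i;

	if (c->gid == gid)
		return true;
	for (i = 0; i < c->ngroups; i++)
		if (c->groups[i] == gid)
			return true;
	return false;
}

/* Pick the owner, group or other triplet, then test the wanted bits. */
static bool
may_access(const struct cached_perm *p, const struct client_cred *c,
    int want)
{
	mode_t bits;

	if (c->uid == 0)
		return true;	/* simplification: superuser always passes */
	if (c->uid == p->uid)
		bits = (p->mode >> 6) & 7;
	else if (in_group(c, p->gid))
		bits = (p->mode >> 3) & 7;
	else
		bits = p->mode & 7;
	return (bits & (mode_t)want) == (mode_t)want;
}
```

For a file cached with mode 0640, the owner passes a read/write check, a
member of the file's group passes a read check only, and anyone else is
refused.  Cached permissions can of course go stale, so they would need
to be refreshed when attribute change events arrive.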


I'm not yet sure how large the cache would be for general use, although
it would be easy to write a small program to test this on various paths
and measure the cost of building the cache, the cost of lookups and the
resulting cache size.

If the system were only to be used by modules running as the superuser,
access checks would probably not be necessary and dlopen(3) could be
used to load them; a function of each module could then be called to
propagate events to it, or similar.

I would be interested to know more about the particular project(s) this
feature was requested for.  For a fam daemon replacement, the
permissions checking would seem unavoidable, IMO.

It would also be great to hear others' ideas about how this could be
done more effectively...

Thanks,
Matt