Subject: events. (Re: last month's events thread.)
To: None <tech-kern@netbsd.org>
From: Todd Whitesel <toddpw@best.com>
List: tech-kern
Date: 12/22/1998 01:56:38

This refers to a thread from last month :)

> Jason and I have talked about putting in a generic "events" interface
> for some time.

Cool. Things like this (and WSCONS, UVM...) make me glad I went with NetBSD.

> To userland, the events interface would be via read/write to a file
> descriptor that would be acquired either via a pseudodevice or a
> special socket like the routing sockets we have now. (Jason is adamant
> that he wants to use a socket -- I'm far from convinced, if only
> because I prefer using file system name spaces, but he's much better
> at this stuff than me.)

I prefer something that is selectable; in fact I wish _everything_ useful
could be made selectable somehow (especially wait()). I've worked on real
commercial apps where threads were not an option but we had to watch a huge
list of fd's, child processes, and such, and we didn't have much choice but
to ship with a huge loop of non-blocking tests and microsleep trickery. Barf.

I must confess my ignorance about the 'pseudodevice' method (does this use
ioctl() on a /dev/pd0?) and I'm not sure how 'file system name spaces' would
operate either (do I have to opendir/readdir all the time to see new names
popping in or what?).

But I think it should be noted that there is already one events interface
using sockets which we have used successfully for years: X11. It sure ain't
perfect, but it does work and it does tackle some of the same issues.

> To the kernel, the interface would probably be via registration of
> callbacks.

Hmm, would kernel threads make it more attractive to have an "input queue"
for each event receiver, and the receiver simply blocks on the queue? Are
kernel threads likely to become standard or is there value in keeping them
optional?

I personally do not trust callbacks/signals very much. I prefer it if they
are either (a) "pure" and totally re-entrant, (b) set a global flag and
return, or (c) throw an exception (e.g. longjmp). There is a reason UNIX
shells are the grossest programs in the world: they actually do everything
you have to do if you want to handle signals robustly.

> Via the interface, one could register for interest in certain kinds of 

Like:
    - the stat() data for this inode has changed.
    - the stat() data of anything under this inode has changed. (expensive?)
    - mode sense data(?) for this device has changed. (disk insert/remove)
    - umm... just map everything into the file system and use the first two?

> events, and then get notification (via messages arriving on the file
> descriptor or via the callbacks) of all the sorts of events we're
> worrying about.

Right. I was wondering, what's the "overflow" behavior like? I don't think
we want to allow denial-of-service attacks that flood the event system. One
of two things must be true: event sources must be capable of blocking as
pipe-writers currently do when the pipe fills; or event sources must be able
to 'merge' events in some well-defined way, so that a finite queue size is
preserved yet no vital information is lost.

Example: if APM writes multiple sleep, wake, sleep, wake sequences to a
process that is blocked on something unrelated like a network socket, we
don't want the event queue to slowly back up and choke kmem, nor do we
want APM to block! So we could define some interfaces as mailboxes of
sticky bits, such as "system has slept at least once since last event"
and "system has woken at least once since last event".
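
A minimal sketch of such a sticky-bit mailbox, to make the merge rule
concrete. The names (ev_mailbox, EV_SLEPT, EV_WOKE, ev_post, ev_fetch) are
hypothetical, and a real kernel version would need locking or interrupt
protection around both operations.

```c
#include <stdint.h>

/* Hypothetical event bits, one per "has happened at least once" fact. */
#define EV_SLEPT 0x1u   /* system has slept since the last fetch */
#define EV_WOKE  0x2u   /* system has woken since the last fetch */

/* Fixed-size mailbox: storage does not grow with the number of posts. */
struct ev_mailbox {
    uint32_t pending;
};

/* Source side: never blocks, never allocates; repeated posts merge. */
static void ev_post(struct ev_mailbox *mb, uint32_t bits)
{
    mb->pending |= bits;
}

/* Receiver side: return everything seen since the last fetch and clear. */
static uint32_t ev_fetch(struct ev_mailbox *mb)
{
    uint32_t bits = mb->pending;

    mb->pending = 0;
    return bits;
}
```

With this, any number of sleep, wake, sleep, wake postings collapse into a
single fetch that reports both bits. The receiver loses the count and the
ordering, so this only suits interfaces where "at least once" is all that
matters.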

> One requirement I think we need for APM events is that the notified
> parties must have the chance to finish their work before APM goes to
> sleep.  I don't particularly care if that's by calling synchronously, or
> waiting for a thread delivering the notification to exit, or something

It would be nice if we could do something like guaranteeing all event
receivers at least N cycles of user CPU time to respond to the signal,
and within that time root can do anything (even delay or cancel the action)
but non-root can only ask for a single finite extension (bounded by a
sysctl limit). This is actually getting kind of complicated, but I think
it's because the APM case is more than just event delivery, it's careful
negotiation before systemwide actions occur.
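
To pin down the policy part, here is a rough sketch of the bookkeeping for
one receiver. Everything in it is an assumption made for illustration: the
struct and function names, the abstract "ticks" unit standing in for user
CPU time, and max_ext standing in for the sysctl limit.

```c
#include <stdbool.h>

/* Per-receiver negotiation state (hypothetical layout). */
struct grace {
    unsigned long remaining;  /* ticks left before the action proceeds */
    unsigned long max_ext;    /* bound on a non-root extension (sysctl) */
    bool ext_used;            /* non-root has already had its extension */
    bool cancelled;           /* root vetoed the action */
};

/* Ask for more time. Root may delay freely; others get one bounded try. */
static bool grace_extend(struct grace *g, bool is_root, unsigned long want)
{
    if (is_root) {
        g->remaining += want;
        return true;
    }
    if (g->ext_used)
        return false;         /* only a single extension for non-root */
    if (want > g->max_ext)
        want = g->max_ext;    /* clamp to the administrator's limit */
    g->remaining += want;
    g->ext_used = true;
    return true;
}

/* Only root may cancel the systemwide action outright. */
static bool grace_cancel(struct grace *g, bool is_root)
{
    if (!is_root)
        return false;
    g->cancelled = true;
    return true;
}
```

This leaves out the hard parts (accounting the CPU time, and what to do
when several receivers disagree), but it shows the asymmetry between root
and non-root requests.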

> Of course, for user-level applications there should be some timeout
> beyond which the APM code says "screw you" and transitions to the next
> power state anyway.

Right, although user programs that wait on the network could potentially get
screwed through no fault of their own. I'm betting this is a Hard Problem,
so there is only so much we can do about it, except teach app writers to be
more careful. The difference between this and shutdown/halt is that when the
user opens his laptop back up, he expects everything to still be there. But
if he's telnetting via cell-modem or something, this might be a little tricky.

Todd Whitesel
toddpw @ best.com