Subject: vague proposal for new scheduler primitive: asynchronous "sleep"
To: None <tech-kern@netbsd.org>
From: Bill Sommerfeld <sommerfeld@orchard.arlington.ma.us>
List: tech-kern
Date: 05/09/1999 18:57:18
I'm of the opinion that a high-efficiency implementation of threads
and the POSIX aio interface will require parts of the kernel to be
rewritten to be non-blocking and event-driven.  One way to ease a
gradual, staged implementation of these changes would be to allow
blocking and non-blocking versions of various routines to coexist.

It occurred to me recently that the following (new?) scheduler
primitive could help with this.  It's analogous to the result of a
high-speed collision between timeout() and tsleep().  For now, I'm
calling it "asleep", though I'm not really attached to the name..

void asleep(void *ident,
            void (*callback)(void *arg, int),
            void *arg, int timo);

"asynchronous sleep".  Returns immediately after setting things up
such that callback() will be called at some point in the future;
either as:

       callback(arg, 1);

after `timo' clock ticks, or as 

       callback(arg, 0);

if wakeup(ident) is called before the timeout expires.  As with
tsleep(), timo == 0 means the sleep never times out.
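
A rough user-space model of these semantics may make them easier to
poke at.  This is only a sketch under assumptions: the fixed-size
table, MAX_PENDING, clock_tick() and record_result() are all made up
for illustration, and the callbacks are fired synchronously from
wakeup() and clock_tick() purely to keep the model small (a real
kernel would probably defer them).

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PENDING 16              /* arbitrary size for the toy table */

struct pending {
    int    in_use;
    void  *ident;
    void (*callback)(void *arg, int timed_out);
    void  *arg;
    int    ticks_left;              /* 0 means "never time out" */
};

static struct pending table[MAX_PENDING];

/* Queue the callback and return immediately. */
void asleep(void *ident, void (*callback)(void *arg, int), void *arg, int timo)
{
    for (size_t i = 0; i < MAX_PENDING; i++) {
        if (!table[i].in_use) {
            table[i] = (struct pending){ 1, ident, callback, arg, timo };
            return;
        }
    }
    assert(!"toy table full");      /* the model has no way to report failure */
}

/* Fire callback(arg, 0) for every entry waiting on ident. */
void wakeup(void *ident)
{
    for (size_t i = 0; i < MAX_PENDING; i++) {
        if (table[i].in_use && table[i].ident == ident) {
            struct pending p = table[i];
            table[i].in_use = 0;    /* dequeue before calling out */
            p.callback(p.arg, 0);
        }
    }
}

/* Advance the model clock by one tick; fire callback(arg, 1) on expiry. */
void clock_tick(void)
{
    for (size_t i = 0; i < MAX_PENDING; i++) {
        if (table[i].in_use && table[i].ticks_left > 0 &&
            --table[i].ticks_left == 0) {
            struct pending p = table[i];
            table[i].in_use = 0;
            p.callback(p.arg, 1);
        }
    }
}

/* Example callback: stores 'T' (timed out) or 'W' (woken) through arg. */
void record_result(void *arg, int timed_out)
{
    *(int *)arg = timed_out ? 'T' : 'W';
}
```

An entry queued with timo == 2 and never woken gets
record_result(arg, 1) on the second clock_tick(); an entry queued with
timo == 0 stays queued until wakeup() is called on its ident.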

You can then create asynchronous versions of various kernel primitives
by splitting non-blocking chunks of code into separate functions and
by replacing calls to sleep or tsleep with tail-calls to asleep,
without needing to touch the interrupt-side code which kicks the
sleeper awake.
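
As an illustration of that split, here is a sketch of what a converted
routine might look like.  Everything in it is hypothetical: the
single-slot asleep()/wakeup() stub stands in for the proposed
primitive, and struct devbuf, struct read_req, read_start(),
read_finish() and dev_intr() are invented names, not existing kernel
code.  The point is only that the code before the old tsleep() becomes
one function, the code after it becomes the callback, and the
interrupt side still just calls wakeup().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Single-slot stand-in for the proposed primitive (hypothetical). */
static struct {
    void  *ident;
    void (*cb)(void *arg, int timed_out);
    void  *arg;
} slot;

void asleep(void *ident, void (*cb)(void *arg, int), void *arg, int timo)
{
    (void)timo;                     /* timeouts are not modelled here */
    assert(slot.cb == NULL);        /* the stub holds one sleeper at a time */
    slot.ident = ident;
    slot.cb = cb;
    slot.arg = arg;
}

void wakeup(void *ident)
{
    if (slot.cb != NULL && slot.ident == ident) {
        void (*cb)(void *, int) = slot.cb;
        slot.cb = NULL;             /* dequeue before calling out */
        cb(slot.arg, 0);
    }
}

/* A made-up device buffer that the interrupt side fills in. */
struct devbuf {
    int  ready;
    char data[16];
};

/* Per-request state that used to live on the sleeping process's stack. */
struct read_req {
    struct devbuf *buf;
    char          *dst;
    size_t         len;             /* must not exceed sizeof buf->data */
    int            done;            /* set once the request has completed */
};

/* Second half: everything that used to follow the tsleep() call. */
void read_finish(void *arg, int timed_out)
{
    struct read_req *req = arg;

    (void)timed_out;
    if (!req->buf->ready) {         /* woken too early: wait again */
        asleep(req->buf, read_finish, req, 0);
        return;
    }
    memcpy(req->dst, req->buf->data, req->len);
    req->done = 1;
}

/* First half: everything up to the point where the old code slept. */
void read_start(struct read_req *req)
{
    if (req->buf->ready) {
        read_finish(req, 0);        /* nothing to wait for */
        return;
    }
    asleep(req->buf, read_finish, req, 0);  /* tail call; returns at once */
}

/* Interrupt side: the same wakeup() call the blocking version relied on. */
void dev_intr(struct devbuf *buf, const char *bytes, size_t len)
{
    memcpy(buf->data, bytes, len);
    buf->ready = 1;
    wakeup(buf);
}
```

Note that the state which a blocked process would keep on its kernel
stack has to move into an explicit request structure; that is the main
cost of the conversion in this sketch.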

Random questions:
 - Did I just reinvent a square wheel?  (i.e., is this worthwhile at all?)

 - sleep/tsleep don't have to allocate memory; they just chain the
struct proc into a hash table; asleep, on the other hand, does need to
allocate space (just like timeout); however, the interface could be
rearranged so that the caller allocates the space for it.

 - is the timeout arg actually useful?  I'm not sure.

 - do we need an unasleep, analogous to untimeout, to remove a queued
callback without firing it?  If we have unasleep(), the timeout, when
needed, can be handled by separate use of timeout().

 - Exactly when does the callback occur?  Having it occur
synchronously out of wakeup() is almost certainly wrong.  I'm leaning
towards having it run in the context of a kernel worker thread.  (a
third possibility would be for it to happen in some sort of soft
interrupt level).  If we use kernel threads, some applications
might want to take advantage of this and allow the callback routine
to block, so it might make sense to have multiple such (pools of)
threads, with a queue identifier passed as one of the arguments to
asleep(), so that there would be a way to prevent blocking callbacks
from getting in the way of non-blocking ones..
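
Two of the questions above (caller-allocated storage and unasleep())
can be sketched together.  The following is one possible shape, not a
settled interface: struct asleep_entry, asleep_with() and
count_wakeup() are invented names, the int return value of unasleep()
is an assumption, and a single linked list stands in for the
scheduler's hash table.  The timo argument is dropped on the theory
that, with unasleep() available, a timeout can be layered on top with
timeout().

```c
#include <assert.h>
#include <stddef.h>

/* Caller-provided storage for one queued callback (hypothetical layout). */
struct asleep_entry {
    struct asleep_entry *next;
    void                *ident;
    void               (*callback)(void *arg, int timed_out);
    void                *arg;
};

static struct asleep_entry *sleepers;   /* one list instead of a hash table */

/* Like asleep(), but the caller supplies the entry, so nothing is allocated. */
void asleep_with(struct asleep_entry *e, void *ident,
                 void (*callback)(void *arg, int), void *arg)
{
    assert(e != NULL && callback != NULL);
    e->ident = ident;
    e->callback = callback;
    e->arg = arg;
    e->next = sleepers;
    sleepers = e;
}

/* Remove a queued entry without firing it; returns 1 if it was queued. */
int unasleep(struct asleep_entry *e)
{
    for (struct asleep_entry **pp = &sleepers; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == e) {
            *pp = e->next;
            e->next = NULL;
            return 1;
        }
    }
    return 0;
}

void wakeup(void *ident)
{
    struct asleep_entry *fire = NULL, **pp = &sleepers;

    while (*pp != NULL) {               /* unlink every match first */
        struct asleep_entry *e = *pp;
        if (e->ident == ident) {
            *pp = e->next;
            e->next = fire;
            fire = e;
        } else {
            pp = &e->next;
        }
    }
    while (fire != NULL) {              /* then run the callbacks */
        struct asleep_entry *e = fire;
        fire = e->next;
        e->next = NULL;
        e->callback(e->arg, 0);
    }
}

/* Example callback: counts invocations in the int that arg points to. */
void count_wakeup(void *arg, int timed_out)
{
    (void)timed_out;
    ++*(int *)arg;
}
```

Unlinking all matching entries before running any callback means a
callback can safely queue its own entry again without being fired
twice by the same wakeup().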

Other uses (possibly stretching things a bit)..

      - this might be part of a "kernel event handler" subsystem..

      - this could conceivably be used for a generic
        soft-interruptish mechanism.

-----

Implementation thoughts:

The easy way to do it is to add a second hash table to the scheduler,
managed in parallel with the proc-based hash table.  This would
approximately double the cost of wakeup().

An alternate approach with better performance (but more in the way of
code changes) would involve making the "asleep-queue-entry" structure
and proc structure both start with a common generic
"thing-which-can-be-woken" structure, so that both can be kept in the
same hash table.
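
A minimal sketch of that second approach, under assumptions: struct
wakeable, struct toy_proc (a stand-in for struct proc), the wk_kind
tag, NBUCKET, the hash function, and the proc_sleep()/asleep_on()
names are all hypothetical.  Because the common header is the first
member of both structures, a pointer to the header can be converted
back to a pointer to the enclosing structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum wk_kind { WK_PROC, WK_ASLEEP };

/* Common header: the only thing the sleep hash table knows about. */
struct wakeable {
    struct wakeable *next;
    void            *ident;
    enum wk_kind     kind;
};

/* Toy stand-in for struct proc; the header must be the first member. */
struct toy_proc {
    struct wakeable wk;
    int             runnable;
};

struct asleep_entry {
    struct wakeable wk;
    void          (*callback)(void *arg, int timed_out);
    void           *arg;
};

#define NBUCKET 8                       /* arbitrary table size */
static struct wakeable *bucket[NBUCKET];

static size_t hash_ident(void *ident)
{
    return (size_t)(((uintptr_t)ident >> 4) % NBUCKET);
}

/* Shared enqueue path for both kinds of sleeper. */
static void wk_insert(struct wakeable *wk, void *ident, enum wk_kind kind)
{
    size_t h = hash_ident(ident);

    wk->ident = ident;
    wk->kind = kind;
    wk->next = bucket[h];
    bucket[h] = wk;
}

void proc_sleep(struct toy_proc *p, void *ident)
{
    p->runnable = 0;
    wk_insert(&p->wk, ident, WK_PROC);
}

void asleep_on(struct asleep_entry *e, void *ident,
               void (*callback)(void *arg, int), void *arg)
{
    assert(callback != NULL);
    e->callback = callback;
    e->arg = arg;
    wk_insert(&e->wk, ident, WK_ASLEEP);
}

/* One walk of one bucket wakes both processes and queued callbacks. */
void wakeup(void *ident)
{
    struct wakeable *woken = NULL, **pp = &bucket[hash_ident(ident)];

    while (*pp != NULL) {               /* phase 1: unlink matches */
        struct wakeable *wk = *pp;
        if (wk->ident == ident) {
            *pp = wk->next;
            wk->next = woken;
            woken = wk;
        } else {
            pp = &wk->next;
        }
    }
    while (woken != NULL) {             /* phase 2: dispatch by kind */
        struct wakeable *wk = woken;
        woken = wk->next;
        if (wk->kind == WK_PROC) {
            ((struct toy_proc *)wk)->runnable = 1;
        } else {
            struct asleep_entry *e = (struct asleep_entry *)wk;
            e->callback(e->arg, 0);
        }
    }
}

/* Example callback: counts invocations in the int that arg points to. */
void bump(void *arg, int timed_out)
{
    (void)timed_out;
    ++*(int *)arg;
}
```

In this sketch wakeup() walks a single bucket regardless of how many
kinds of sleeper exist, at the price of a kind check per woken entry.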

Comments?

					- Bill