Subject: Re: curproc and nfsd
To: None <rick@snowhite.cis.uoguelph.ca>
From: Jonathan Stone <jonathan@dsg.stanford.edu>
List: tech-kern
Date: 05/26/2004 11:53:24
In message <200405261517.LAA33360@snowhite.cis.uoguelph.ca>
rick@snowhite.cis.uoguelph.ca writes:

Hi Rick,

>What I think might be a good idea and would avoid a fair number of
>the context switches is to have a kernel context (aka kernel thread)
>that can sleep, which is dedicated to handling network
>input. It would pick up the received packets straight from the network
>device driver and either:
>- drop them off in the receive queue of a socket
>OR
>- continue on, if there is a socket upcall

NetBSD 2.0 and -current have a manpage for kcont(9). It's a bit skeletal
right now, and the hooks aren't yet there to do what it's intended for.

kcont is a way of bundling up what some people might call a
``continuation'' (in the Lisp/Scheme sense, cut down drastically to
fit ANSI C and an in-kernel environment).  I've used very kcont-like
objects, over several years, to implement upper-layer protocols
sitting, as you say, on top of socket-callback functions.

If you're not used to continuation-passing style (and even that is
pretty minimal, given C has no lexically-scoped functions, closures,
etc), you can think of kcont as a callback function and some
associated state, all on steroids.
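
To make "a callback function and some associated state" concrete, here
is a minimal userland sketch. All names in it (struct cont, cont_init,
cont_run, rec_state, on_data) are made up for illustration; this is not
the kcont(9) interface, only the general shape of the idea.

```c
#include <stddef.h>

/* Hypothetical names throughout; this is NOT the kcont(9) interface. */
struct cont {
	void (*fn)(void *obj, void *env, int status);	/* what to run next */
	void  *env;					/* state saved for it */
};

/* Bundle a function with the state it will need later. */
static void
cont_init(struct cont *c, void (*fn)(void *, void *, int), void *env)
{
	c->fn = fn;
	c->env = env;
}

/* Invoke the continuation on an object, passing a completion status. */
static void
cont_run(struct cont *c, void *obj, int status)
{
	c->fn(obj, c->env, status);
}

/* Example: state an upper-layer protocol keeps between callbacks. */
struct rec_state {
	size_t bytes_seen;
	int    last_status;
};

/* Example continuation: obj points at the length of the data delivered. */
static void
on_data(void *obj, void *env, int status)
{
	struct rec_state *st = env;
	size_t *len = obj;

	st->bytes_seen += *len;
	st->last_status = status;
}
```

Since C has no closures, the env pointer stands in for the captured
variables: whatever the callback needs must be packed into it by hand.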

Eventually (soon?) I intend kcont to be usable as a
continuation-passing mechanism for I/O on buffers, as a direct
alternative to (lw)process-based [l]tsleep/wakeup: bundle up a
continuation function, the object you're issuing an I/O on, and other
state (everything you had at your fingertips before you were so rudely
interrupted by having to wait for an I/O).  The I/O subsystem would
call the continuation after the I/O is done.
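
A rough userland sketch of that flow, under stated assumptions: the
names (struct io_req, io_start, io_complete_all, read_ctx, read_done)
are invented for illustration and are neither the kcont(9) nor the uio
interface, and the "device" is just a pending list that fills buffers
with dummy data.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names throughout; not the kcont(9) or uio interface. */
struct io_req {
	char   *buf;		/* object the I/O is issued on */
	size_t  len;
	void  (*done)(struct io_req *, void *env, int error);
	void   *env;		/* caller state saved across the wait */
	struct io_req *next;
};

static struct io_req *io_pending;	/* requests not yet completed */

/* Issue the I/O and return at once; nobody sleeps. */
static void
io_start(struct io_req *req,
    void (*done)(struct io_req *, void *, int), void *env)
{
	req->done = done;
	req->env = env;
	req->next = io_pending;
	io_pending = req;
}

/* Stand-in for the completion path, where wakeup() would otherwise go. */
static void
io_complete_all(int error)
{
	struct io_req *req;

	while ((req = io_pending) != NULL) {
		io_pending = req->next;
		if (error == 0)
			memset(req->buf, 'x', req->len); /* pretend data arrived */
		req->done(req, req->env, error);
	}
}

/* Example caller state and the continuation that resumes the work. */
struct read_ctx {
	size_t total;
	int    error;
};

static void
read_done(struct io_req *req, void *env, int error)
{
	struct read_ctx *ctx = env;

	ctx->error = error;
	if (error == 0)
		ctx->total += req->len;
}
```

The caller returns right after io_start(); everything it would have
done after waking up lives in read_done() and the read_ctx it is handed.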

That seems doable for I/O on uios: add a kcont into struct uio.

But I have to admit that getting the VFS layer (vget(), etc)
to do anything other than ltsleep() is a nightmare.


>If the thread doing the socket upcall can sleep, it can do an NFS RPC and
>there would be no need for separate nfsd threads and context switches to them.
>(Related to your discussion, these threads would have to have enough
> "proc like" information that they would satisfy the needs of the VFS/VOP
> calls. It would be really nice to know what uses are made of uio_procp,
> cn_procp (or thread * for FreeBSD5) and minimize that. Maybe a small
> structure that just has the fields of "struct proc" or "struct thread"
> that the VFS/VOP calls need from these?)
>
>Since they're kernel only and network receive only, they could be trimmed
>down and get special treatment from the scheduler.

I was thinking more of bundling up the necessary state and calling back
the first-level continuation in whatever context the file-plus-I/O
subsystem currently calls wakeup(). If one needs to trampoline from
there to a lower-priority context -- say, a softnet context -- then
kcont already has the functionality to do that for you.
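
As an illustration of the trampoline idea only: the sketch below runs a
continuation directly if the completing context is low enough, and
otherwise queues it for a later, lower-priority pass. The names
(ctx_level, struct dcont, dcont_complete, dcont_run_deferred,
count_done) and the two-level priority model are assumptions made for
this example; they are not how kcont(9) spells it.

```c
#include <stddef.h>

/* Hypothetical two-level model; real kernels have more levels. */
enum ctx_level { CTX_SOFTNET = 1, CTX_HARDINTR = 2 };

struct dcont {
	void (*fn)(void *env, int status);
	void  *env;
	int    status;			/* saved until the deferred run */
	enum ctx_level max_level;	/* highest context fn may run in */
	struct dcont *next;
};

/* FIFO of continuations waiting for a lower-priority context. */
static struct dcont *defer_head;
static struct dcont **defer_tail = &defer_head;

/* Called where the I/O finishes; cur is the context we are in now. */
static void
dcont_complete(struct dcont *c, enum ctx_level cur, int status)
{
	c->status = status;
	if (cur <= c->max_level) {
		c->fn(c->env, status);	/* already low enough: run now */
		return;
	}
	c->next = NULL;			/* too high: trampoline via the queue */
	*defer_tail = c;
	defer_tail = &c->next;
}

/* Called later from the lower-priority (e.g. softnet-like) context. */
static void
dcont_run_deferred(void)
{
	struct dcont *c;

	while ((c = defer_head) != NULL) {
		defer_head = c->next;
		if (defer_head == NULL)
			defer_tail = &defer_head;
		c->fn(c->env, c->status);
	}
}

/* Example continuation: count successful completions. */
static void
count_done(void *env, int status)
{
	int *n = env;

	if (status == 0)
		(*n)++;
}
```

In a real kernel the queue would need protection against the interrupt
that feeds it; the sketch leaves that out.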

Even if that turns out to be too hard, a kcont-like framework can (as
you allude) be very useful for handing data streams to and from
sockets, parsing record boundaries, etc., without going to a
full-blown process context (and ensuing context switches).