Subject: Re: Prototype kernel continuation-passing for NetBSD
To: Jonathan Stone <jonathan@dsg.stanford.edu>
From: Matt Thomas <matt@3am-software.com>
List: tech-kern
Date: 03/26/2004 10:09:15
On Mar 26, 2004, at 9:15 AM, Jonathan Stone wrote:

>
> In message <6.0.3.0.2.20040325150418.036996c8@localhost>, Matt Thomas writes:
>> At 01:04 PM 1/28/2004, Jonathan Stone wrote:
>
> [...]
>
> Clearly I should improve kcont(9), because it hasn't been fully
> understood.
>
> kcont() will eventually/soon have functionality for asynchronous
> notification.  It really _is_ intended to replace [l]tsleep/wakeup(),
> so that (for example) you can implement an nfs server without
> requiring a context-switch (the sleep/wakeup) to notify the nfs-server
> that i/o on a buffer has completed.  Crunch the numbers on filling
> over half a 10GbE pipe; context-switch per operation is prohibitive.

I understand.  It would be nice if the entire kcont API could be
specified now, even if it's unimplemented.

> That would (obviously) require passing a kcont object down through the
> VFS layer, and chaining kconts off a struct buf with pending I/O.
> I've looked at adding a struct kcont* to one of the other structs
> already passed by reference.
>
> You're absolutely right about the assumptions I made with IPLs. I'd
> respond that IPLs should be explicitly ordered; and the (macppc)
> become less shabby. Or we could fix kcont to not rely on that
> assumption; I haven't thought hard about how to do that.

While I can see your point, I'm not sure I agree.  IPLs can map
to hardware IPLs (think vax) and placing a restriction on ordering
is not correct.

Note that on i386, higher IPLs do not block lower IPLs.  In a
sense, they are independent.  IPL_NET blocks network interrupts, but
BIO or TTY interrupts may proceed.

You can't compare IPL levels in NetBSD.
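
To illustrate the point, here is a minimal toy model in C, assuming each
IPL is represented as the set of interrupt sources it masks.  All names
(SRC_*, MASK_*, spl_raise, covers) are hypothetical and are not NetBSD's
actual definitions; MASK_HIGH is added only to show that some pairs of
levels are comparable while others are not.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical interrupt sources, one bit each (not NetBSD's real values). */
enum { SRC_BIO = 1u << 0, SRC_NET = 1u << 1, SRC_TTY = 1u << 2 };

/* In this model a "level" is simply the set of sources it blocks. */
enum {
    MASK_BIO  = SRC_BIO,
    MASK_NET  = SRC_NET,
    MASK_TTY  = SRC_TTY,
    MASK_HIGH = SRC_BIO | SRC_NET | SRC_TTY
};

uint32_t cur_mask;  /* sources currently blocked */

/* Block the sources in `mask` on top of whatever is already blocked. */
uint32_t spl_raise(uint32_t mask)
{
    uint32_t old = cur_mask;
    cur_mask |= mask;
    return old;     /* caller hands this back to spl_restore() */
}

void spl_restore(uint32_t old) { cur_mask = old; }

bool blocked(uint32_t src) { return (cur_mask & src) != 0; }

/* True if level `a` blocks everything level `b` blocks.  This is only a
 * partial order: for MASK_NET and MASK_BIO it is false both ways. */
bool covers(uint32_t a, uint32_t b) { return (a & b) == b; }
```

In this model, code that asks "is my IPL higher than that one?" has no
meaningful answer for MASK_NET versus MASK_BIO; the only question that
always makes sense is "does this level block that source?".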

>> At this point, I would instead make generic softintr required
>> functionality, extend their capability, and kill kconts.  Or kill
>> generic softintr and use kconts instead.  One or other but not both.
>
> I'd kill the generic softints.  kcont lets you build and queue a kcont
> intended to be called *much* later, such as when an I/O on the kcont
> "object" completes. If you start reworking generic softints to add
> that ability (so that we truly don't need both), you very quickly
> arrive at something like kcont.

One thing the softintr interface has over kcont is the mandate that
the establishment/disestablishment of softintrs must be done in
a threaded context.

kcont's should have the same restriction: no allocation on the fly.
Especially in "interrupt" context.
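
A minimal userland sketch of that split, assuming a hypothetical `struct
cont` API (cont_establish, cont_schedule, cont_run are illustrative names,
not the kcont or softintr interfaces): allocation happens only at
establish time, and scheduling merely links an already-allocated object
onto a queue.  Locking and spl protection are omitted.

```c
#include <stddef.h>
#include <stdlib.h>

struct cont {
    void (*fn)(void *);   /* function to run later */
    void *arg;
    struct cont *next;
    int pending;          /* nonzero while on a queue */
};

struct cont_queue { struct cont *head, **tailp; };

void cq_init(struct cont_queue *q) { q->head = NULL; q->tailp = &q->head; }

/* Thread context only: this is the one place that may allocate. */
struct cont *cont_establish(void (*fn)(void *), void *arg)
{
    struct cont *c = malloc(sizeof(*c));
    if (c == NULL)
        return NULL;
    c->fn = fn; c->arg = arg; c->next = NULL; c->pending = 0;
    return c;
}

/* Thread context only: release an object that is not queued. */
void cont_disestablish(struct cont *c) { free(c); }

/* Usable from "interrupt" context: no allocation, only pointer updates. */
void cont_schedule(struct cont_queue *q, struct cont *c)
{
    if (c->pending)
        return;           /* already queued; scheduling twice is a no-op */
    c->pending = 1;
    c->next = NULL;
    *q->tailp = c;
    q->tailp = &c->next;
}

/* Run everything queued, in FIFO order; returns how many ran. */
int cont_run(struct cont_queue *q)
{
    int n = 0;
    struct cont *c;
    while ((c = q->head) != NULL) {
        q->head = c->next;
        if (q->head == NULL)
            q->tailp = &q->head;
        c->pending = 0;
        c->fn(c->arg);
        n++;
    }
    return n;
}

/* Example callback: bumps the int that arg points to. */
void count_cb(void *arg) { ++*(int *)arg; }
```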


> Code currently using generic softints would have to allocate a struct
> kcont. That doesn't cost much, especially if it can be embedded in an
> existing long-lived memory object (like a softc).

All the code calls softintr_establish now to get one.  So changing
it won't be difficult.

> I've also had ... considerable experience implementing
> application-level protocols inside the kernel.  For that, kcont is a
> *huge* win over generic softints.
>
> One use I made of (a prior incarnation of kcont) was implementing
> sendfile() and splice(), using socket-upcall functions.  There, one
> very quickly finds oneself wanting to shut down sockets (or even close
> them, or write to them...) from inside an upcall "callback" function,
> which is in turn in the middle of a function activation which is
> frobbing the socket.

I have my own reasons for wanting fully fleshed kconts/softintrs.

The major issue I see with kconts is their naive understanding of IPLs.
This needs to be addressed ASAP.  IMO, you really should have a kcq
per {IPL,kthread=yes/no} needed.
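
One possible shape for that, sketched under assumptions: the queue is
selected by the pair (IPL, kthread flag) and levels are never compared
with each other.  The enum values and the kc_* names are hypothetical and
do not come from the kcont prototype; the push is LIFO only to keep the
sketch short.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical set of levels at which continuations may be run. */
enum kc_ipl { KC_IPL_SOFTCLOCK, KC_IPL_SOFTNET, KC_IPL_SOFTSERIAL, KC_NIPL };

struct kc_item  { struct kc_item *next; };
struct kc_queue { struct kc_item *head; int len; };

/* One queue per {IPL, kthread=yes/no}; zero-initialized (all empty). */
struct kc_queue kcq[KC_NIPL][2];

/* Pick the queue by exact (ipl, kthread) pair; no ordering is assumed. */
struct kc_queue *kcq_select(enum kc_ipl ipl, bool wants_kthread)
{
    return &kcq[ipl][wants_kthread ? 1 : 0];
}

/* Link an item at the head of the chosen queue. */
void kcq_push(struct kc_queue *q, struct kc_item *it)
{
    it->next = q->head;
    q->head = it;
    q->len++;
}
```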
-- 
Matt Thomas                     email: matt@3am-software.com
3am Software Foundry              www: http://3am-software.com/bio/matt/
Cupertino, CA              disclaimer: I avow all knowledge of this message.