Subject: M:N and blocking ops without SA, AIO
To: None <tech-kern@netbsd.org>
From: Matthew Mondor <mm_lists@pulsar-zone.net>
List: tech-kern
Date: 03/01/2007 05:22:45
From my understanding, an M:N threading model which uses a pool of
kernel LWPs without using Scheduler Activations would need to be able to
poll asynchronously in its userland library scheduler for any
potentially blocking syscalls.

Threads which never invoke calls that yield back to the user-space
scheduler could be mapped to an LWP so that the kernel may preempt them
efficiently, but a form of preemption would also be required in the
user-space scheduler to detect such threads and assign them to an LWP,
potentially with minimal kernel help to detect this condition
efficiently.

As for I/O blocking syscalls and locking functions, they could be
remapped to be non-blocking by the library in userland to be handled by
the userland scheduler.  However:

Although I see that kqueue provides good and efficient mechanisms to
handle network I/O polling, I noticed lately that it doesn't provide
similar functionality for disk I/O.  At least the man page doesn't
mention that it might work on descriptors other than sockets, pipes,
FIFOs and vnodes (the latter for modification notification).
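
As a rough illustration of the remapping mentioned above, here is a
minimal C sketch of how a library wrapper might attempt a read without
blocking the LWP and report "would block" so the userland scheduler can
run another thread.  The names ut_set_nonblocking(), ut_try_read() and
enum ut_status are made up for this example and are not part of any
existing NetBSD library.  Note that this only helps where the kernel
honors O_NONBLOCK (sockets, pipes, FIFOs); reads from regular files
typically block regardless, which is the disk I/O gap described above.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical status codes returned to the userland scheduler. */
enum ut_status { UT_DONE, UT_WOULD_BLOCK, UT_ERROR };

/* Put a descriptor in non-blocking mode; 0 on success, -1 on error. */
static int ut_set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/*
 * Try to read without blocking the LWP.  On UT_WOULD_BLOCK the caller
 * (the userland scheduler in this sketch) would park the user thread,
 * register the descriptor for polling, and run another thread.
 */
static enum ut_status ut_try_read(int fd, void *buf, size_t len,
    ssize_t *nread)
{
    assert(fd >= 0 && nread != NULL);

    for (;;) {
        ssize_t n = read(fd, buf, len);

        if (n >= 0) {
            *nread = n;
            return UT_DONE;
        }
        if (errno == EINTR)
            continue;           /* interrupted by a signal: retry */
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            return UT_WOULD_BLOCK;
        return UT_ERROR;        /* errno describes the failure */
    }
}
```

A scheduler built this way would call ut_try_read() again once its
polling mechanism reports the descriptor readable.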

After a short exchange on IRC it appeared that AIO would be the feature
we're missing to allow asynchronous I/O operations and polling for
disk I/O.  Without this functionality, any thread blocking on the disk
would also have to be considered a candidate for mapping to an LWP,
at least temporarily (or to a pool of disk I/O LWP slaves).
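
To make the "disk I/O slave" idea a bit more concrete, here is a small
C sketch using POSIX threads as a stand-in for LWPs.  A slave performs
the blocking pread() and then writes one byte to a pipe, so that the
userland scheduler can treat completion as just another pollable
descriptor.  Everything here (struct io_req, io_submit(), io_collect())
is hypothetical, and for brevity it starts one slave per request
instead of keeping a real pool with a request queue.

```c
#define _XOPEN_SOURCE 700       /* for pread() and mkstemp() */
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

/* One outstanding disk read handed to a slave (hypothetical layout). */
struct io_req {
    pthread_t tid;
    int fd;
    void *buf;
    size_t len;
    off_t off;
    int notify_fd;              /* write end of the completion pipe */
    ssize_t result;             /* pread() return value */
    int error;                  /* errno if result == -1, else 0 */
};

/* Slave body: block in pread(), then wake the scheduler via the pipe. */
static void *io_slave(void *arg)
{
    struct io_req *r = arg;
    char token = 1;

    r->result = pread(r->fd, r->buf, r->len, r->off);
    r->error = (r->result == -1) ? errno : 0;
    while (write(r->notify_fd, &token, 1) == -1 && errno == EINTR)
        continue;
    return NULL;
}

/* Hand a read to a new slave; returns NULL if it could not be started. */
static struct io_req *io_submit(int fd, void *buf, size_t len, off_t off,
    int notify_fd)
{
    struct io_req *r = malloc(sizeof *r);

    if (r == NULL)
        return NULL;
    r->fd = fd;
    r->buf = buf;
    r->len = len;
    r->off = off;
    r->notify_fd = notify_fd;
    r->result = -1;
    r->error = 0;
    if (pthread_create(&r->tid, NULL, io_slave, r) != 0) {
        free(r);
        return NULL;
    }
    return r;
}

/* Join the slave, return the read result, and release the request. */
static ssize_t io_collect(struct io_req *r, int *error)
{
    ssize_t n;

    assert(r != NULL);
    pthread_join(r->tid, NULL);
    n = r->result;
    if (error != NULL)
        *error = r->error;
    free(r);
    return n;
}
```

The scheduler would poll the read end of the pipe alongside its other
descriptors and call io_collect() once a token arrives.  With real AIO
the slave and the pipe would presumably not be needed at all.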

It becomes apparent that with all required tools, only non-yielding
threads performing number crunching (and those for which heuristics show
enough crunching to justify it) would be directly mapped to an LWP...

Do others agree that an M:N implementation without SA done with a
user-space scheduler would greatly benefit from AIO?  If so, were there
ever plans for AIO to eventually be implemented on NetBSD?  What are the
known challenges involved that prevented such a feature so far, if any?
Other than the buffer cache, DMA and interrupt handling in drivers, and
limited parts of VFS and the UFS layout, the rest of the disk I/O system
is still unknown to me.

Thanks,
Matt