Subject: Re: upcalls?
To: Noriyuki Soda <soda@sra.co.jp>
From: Eduardo E. Horvath <eeh@one-o.com>
List: tech-kern
Date: 12/09/1999 09:36:40
> > > Perhaps Uresh Vahalia's "UNIX Internals" might confuse your memory.
> > > 4th paragraph of section 3.5 in this book describes that
> > > 	At any time, a process has exactly one activation
> > > 	for each processor assigned to it.
> > > But this sentence is wrong!
> > > This should be described as follows:
> > > 	At any time, a process has exactly one *running* activation
> > > 	for each processor assigned to it.
> > > I.e. there are more than one activations on each processor, if
> > > there are blocked activations.
> > 
> > Vahalia is primarily describing Solaris/SVR4, and it's my belief that
> > Solaris/SVR4 does *not* work this way.  The activation state is saved
> > by the thread library and then freed; it does not persist for the
> > duration of the block.  In the normal case, a threaded program on
> > Solaris has at most one LWP (which corresponds with a `NetBSD kernel
> > thread') per processor, and no more.
> 
> Hm, now I seem to know where you misunderstood.
> 
> Until Solaris 2.5, Solaris's thread implementation works as you
> described.
> But since Solaris 2.6, the implementation is changed from the above
> way to scheduler activations, because scheduler activations is better
> about performance than above way.
> SGI IRIX and Compaq Tru64 UNIX have been switched to scheduler
> activations before Solaris (because of performance reason).
> 
> Please note that Vahalia book was published before Solaris 2.6 was
> released.

I do not believe that is correct.

Threading originated on SunOS 4.x, with a user thread library.

SunOS 5.x (Solaris 2.x) added true (preemptive) kernel threads, called
LWPs.  However, they still had this neat user thread library.  Instead
of throwing out the thread library, it was adapted to use multiple
LWPs as needed.

The primary difference between the Solaris implementation and
scheduler activations (from my reading of the scheduler activations
paper) is that scheduler activations are generated by the kernel as
needed and call into a fixed entry point in the thread library at the
time the events occur.

The Solaris scheme is different.  It allocates a pool of LWPs to use
for scheduling on the different CPUs.  But when a user thread is about
to issue a system call that could block, and the userland scheduler
determines there are runnable user threads but no available LWPs, the
thread library calls into the kernel to fork a new LWP.  Now one LWP
enters the kernel and blocks, while the other can proceed to execute
some other userland thread.

The pool of LWPs is initialized to some minimal level specified by the
application, but may grow as needed as user threads block.  This
limits the amount of kernel resources that need to be allocated to
support concurrency and does away with the need to dynamically create
kernel stacks for activations.  But I don't think that the number of
LWPs used by an instance of the thread library is limited by the
number of CPUs.
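
To make the pool behaviour described above concrete, here is a toy
model of it.  This is only a sketch of my description, not Solaris
code: the LwpPool class and all of its method names are made up for
illustration, and the "kernel call" that creates an LWP is just a
counter increment.

```python
class LwpPool:
    """Toy model of a thread library's LWP pool (hypothetical, not Solaris code)."""

    def __init__(self, initial_lwps):
        self.idle = initial_lwps    # LWPs with no user thread to run
        self.running = 0            # LWPs currently executing a user thread
        self.blocked = 0            # LWPs blocked inside the kernel
        self.runnable_threads = 0   # user threads waiting for an LWP

    def total(self):
        """Number of LWPs the process currently owns."""
        return self.idle + self.running + self.blocked

    def _dispatch(self):
        # Userland scheduler: hand runnable user threads to idle LWPs.
        while self.idle > 0 and self.runnable_threads > 0:
            self.idle -= 1
            self.runnable_threads -= 1
            self.running += 1

    def make_runnable(self, count=1):
        """Some user threads become runnable."""
        self.runnable_threads += count
        self._dispatch()

    def blocking_syscall(self):
        """A running user thread is about to issue a call that may block."""
        assert self.running > 0, "no running user thread"
        # Runnable threads but no available LWP: grow the pool first.
        # The increment stands in for the call into the kernel.
        if self.runnable_threads > 0 and self.idle == 0:
            self.idle += 1
        # The calling LWP enters the kernel and blocks ...
        self.running -= 1
        self.blocked += 1
        # ... while the other LWP picks up another user thread.
        self._dispatch()


if __name__ == "__main__":
    pool = LwpPool(initial_lwps=2)   # minimal level chosen by the application
    pool.make_runnable(5)            # five user threads, only two LWPs
    for _ in range(3):
        pool.blocking_syscall()
    print(pool.total(), pool.running, pool.blocked)
```

In this model the pool starts at two LWPs and ends up with five after
three user threads block, whatever the number of CPUs is, and it
stops growing once no runnable user threads are left waiting.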

(It is also possible to bind an LWP to a user thread so you're doing
true kernel threading.)

I don't believe there have been any major changes to the thread
library design since 2.0, so any differences between 2.5 and 2.6
should be mostly cosmetic.

=========================================================================
Eduardo Horvath				eeh@netbsd.org
	"I need to find a pithy new quote." -- me