Subject: Solaris MP
To: None <netbsd-advocacy@netbsd.org>
From: Miles Nordin <carton@Ivy.NET>
List: netbsd-advocacy
Date: 12/12/1999 00:22:15
I've ranted on this list before about how the truly relevant goals in
MT/SMP work are poorly understood (or, more reasonably, simply incompletely
implemented) by FreeBSD and Linux.  As part of a truly atrocious operating
systems class, I was told to read the following interesting paper:

 http://csel.cs.colorado.edu/~vasa/courses/csci3753/1061.pdf

It is a White Paper, meaning that it is a pseudo-technical paper with the
stated purpose of trying to sell Sun products, rather than to educate or
advance human knowledge like a true research paper. However, this also
means a non-technical person can fairly easily get something out of it.

The paper doesn't get me as close to understanding the link between MT/SMP
and real-time scheduling as I'd like to be, given that I keep bringing it
up without knowing what I'm talking about, but it's a start. I definitely
recommend reading it if you are interested in the evolution of SMP stuff,
or if you often get into arguments about it.

Here are my observations about the paper.  You should probably read the
paper before you read them, or maybe even read just the paper alone.

Seriously.  Don't be lazy.  Read the paper.  Then come back.



 Encouraging implications:
  o Sun used NetBSD's originally-proposed strategy of getting MT to work
    first, and then working on SMP.  They found the debugging work that 
    they could do on uniprocessor MT useful.  Almost all of their
    work seems highly relevant to uniprocessor systems.  This defends past
    decisions, suggests that we are on the right track, and makes an
    encouraging statement about the future of NetBSD MT/SMP.

  o The implementation of Sun's locking primitives is supposedly designed
    to promote clean code.  At least the idea that some implementations
    are better at promoting future clean code than others puts this stuff
    right up NetBSD's alley, as far as NetBSD's ability to make a useful
    contribution to the field.

  o ``Having threads'' and ``using threads'' are not as closely
    intertwined as one might pessimistically assume.  MT is still a 
    gigantic work, but it's not necessarily something that can't be
    committed until it's finished.  The paper mentions a distinction
    between MT-safe and MT-hot, for example, and hints at instances 
    where merely being ``MT-safe'' is trivial.  It may well be practical
    to debug and commit an MT framework with only a few MT-hot subsystems,
    perhaps subsystems which are MT-hot only for the purpose of debugging
    the framework, yet still be architecturally well ahead of global-lock
    implementations.  There are a lot of decisions to make in the
    implementation that have nothing to do with boasting about how many
    locks you have--indeed, the entire paper is written about such
    decisions.

  o Threads are not necessarily a burden.  Application programmers use
    them not because they want to annoy us, but because it makes their
    work easier, their code cleaner, and their goals easier to reach.  As
    the paper states outright, kernel code is likely to enjoy these same
    benefits.  Adding MT to the kernel shouldn't be misconceived as
    something that will make every bit of kernel code from then on harder
    to write--it may well do just the opposite.

  o Kernel threads decrease the relevance of certain burdensome optimizations
    that one needs to worry about in an old-school kernel.  For example,
    an interrupt handler doesn't need to return as quickly if it can
    simply throw itself onto the regular scheduler when it starts running
    too long.  Thus, threads could improve the performance of the kernel,
    accelerate the pace of kernel development, promote the sanity and
    happiness of kernel developers--or all three.

  o Having threads in the NetBSD kernel will permit us to write kernel 
    code and device drivers that others can't use at all, or can't use 
    to the fullest benefit.
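
To make the MT-safe / MT-hot distinction a bit more concrete, here is a
minimal sketch of how I read it, written in C against POSIX threads rather
than any real Solaris or NetBSD kernel primitive.  Every name in it
(NBUCKETS, safe_add, hot_add, and so on) is made up for illustration, and
the paper may well draw the line somewhere else.  The "safe" version is
correct under concurrency but funnels every caller through one lock; the
"hot" version gives each bucket its own lock so callers touching different
buckets don't contend.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NBUCKETS 8   /* hypothetical table size, for illustration only */

/* MT-safe: correct under concurrency, but one lock serializes everyone. */
static pthread_mutex_t safe_lock = PTHREAD_MUTEX_INITIALIZER;
static long safe_counts[NBUCKETS];

void safe_add(size_t bucket, long n)
{
    assert(bucket < NBUCKETS);
    pthread_mutex_lock(&safe_lock);
    safe_counts[bucket] += n;
    pthread_mutex_unlock(&safe_lock);
}

long safe_total(void)
{
    long total = 0;

    pthread_mutex_lock(&safe_lock);
    for (size_t i = 0; i < NBUCKETS; i++)
        total += safe_counts[i];
    pthread_mutex_unlock(&safe_lock);
    return total;
}

/* MT-hot: one lock per bucket, so unrelated callers can run in parallel. */
struct hot_bucket {
    pthread_mutex_t lock;
    long count;
};
static struct hot_bucket hot_table[NBUCKETS];

/* Must be called once before any hot_add() or hot_total(). */
void hot_init(void)
{
    for (size_t i = 0; i < NBUCKETS; i++) {
        pthread_mutex_init(&hot_table[i].lock, NULL);
        hot_table[i].count = 0;
    }
}

void hot_add(size_t bucket, long n)
{
    assert(bucket < NBUCKETS);
    pthread_mutex_lock(&hot_table[bucket].lock);
    hot_table[bucket].count += n;
    pthread_mutex_unlock(&hot_table[bucket].lock);
}

/* Locks one bucket at a time, so the result is not an atomic snapshot
 * if other threads are adding concurrently. */
long hot_total(void)
{
    long total = 0;

    for (size_t i = 0; i < NBUCKETS; i++) {
        pthread_mutex_lock(&hot_table[i].lock);
        total += hot_table[i].count;
        pthread_mutex_unlock(&hot_table[i].lock);
    }
    return total;
}
```

Both halves present the same interface to callers, which is the point:
a subsystem could be committed in the "safe" form first and made "hot"
later without touching the code that uses it.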

 Discouraging implications:
  o Having threads in the NetBSD kernel will encourage us to write kernel 
    code and device drivers that others can't use at all, or can't use 
    to the fullest benefit.

  o Sun threw out the BSD codebase before they started working on this.
    We can't do that.  However, if and when we finish we're likely to have
    superior code to Sun's, since free Unixes typically already beat 
    Solaris on performance and system call overhead (and filesystem
    advances, VM robustness, software RAID usefulness, networking
    implementation, source code usability, having a well-maintained
    Coda port, pleasant non-committee-moron documentation, and the
    fact that they come with working C compilers, and stuff.)

  o Their implementation required writing debugging and profiling
    tools, and doing some fairly ambitious testing and optimization.  This
    offers some perspective on just how unreasonable an undertaking this
    is for a single person working unfunded (in his free time).

  o The implementation of synchronization primitives as function calls
    rather than language primitives probably limits egcs's ability to make
    useful optimizations.  For example, a compiler that understood the
    primitives could keep volatile variables cached in a register until
    the end of a critical section.  Perhaps we could
    work around this by attaching some egcs-hint to the function through a
    header file #define, but more likely this problem will remain out of
    our hands for a long time, if not forever.
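
The register-caching point in the last item can be sketched by doing the
optimization by hand.  This is only an illustration in C with POSIX
threads; the names (sum_lock, shared_sum, sum_add_each, sum_add_cached)
are invented, and what a given compiler actually emits depends on the
compiler and its flags.  In the first function every access to the
volatile variable has to be a real load and store.  In the second, the
programmer copies it into a local for the duration of the critical
section, which is roughly what a compiler could do itself if it knew that
the lock and unlock calls delimit the only region where the variable
changes.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t sum_lock = PTHREAD_MUTEX_INITIALIZER;
static volatile long shared_sum;   /* hypothetical shared state */

/* Straightforward version: because shared_sum is volatile, each loop
 * iteration must read it from memory and write it back. */
void sum_add_each(const long *v, size_t n)
{
    assert(v != NULL || n == 0);
    pthread_mutex_lock(&sum_lock);
    for (size_t i = 0; i < n; i++)
        shared_sum += v[i];
    pthread_mutex_unlock(&sum_lock);
}

/* Hand-cached version: one read on entry, one write on exit.  The local
 * is free to live in a register for the whole critical section. */
void sum_add_cached(const long *v, size_t n)
{
    assert(v != NULL || n == 0);
    pthread_mutex_lock(&sum_lock);
    long local = shared_sum;
    for (size_t i = 0; i < n; i++)
        local += v[i];
    shared_sum = local;
    pthread_mutex_unlock(&sum_lock);
}

/* Read the current value under the lock. */
long sum_read(void)
{
    pthread_mutex_lock(&sum_lock);
    long value = shared_sum;
    pthread_mutex_unlock(&sum_lock);
    return value;
}
```

The two functions compute the same result; the difference is only in how
many memory accesses the volatile qualifier forces inside the lock.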

The usual problem with my emails applies:  although I may use pretentious
language, my actual understanding of the subject is pretty poor and
simplistic.  Since the people who do understand would probably prefer to
write code than rant for several pages on mailing lists, the chances of my
stating a convincing lie and never getting corrected are high.  (E.g.,
it's happened before: I said NTFS isn't transaction-based, when it does
have some kind of transaction log.)

That said, I'd obviously welcome potentially interesting comments.

-- 
Miles Nordin / v:1-888-857-2723 fax:+1 530 579-8680
555 Bryant Street PMB 182 / Palo Alto, CA 94301-1700 / US