Subject: Re: Multiprocessor with NetBSD ?
To: NetBSD-current Discussion List <current-users@NetBSD.ORG>
From: Greg A. Woods <woods@weird.com>
List: current-users
Date: 06/06/2001 03:46:49
[ On Tuesday, June 5, 2001 at 23:20:37 (-0500), mike stone wrote: ]
> Subject: Re: Multiprocessor with NetBSD ?
>
> it would be entirely possible to build a shared-nothing system that can
> run on a box with any number of CPUs.   the main point is that each
> CPU would have sole authority over any resources it touches, and that
> any other CPU would have to go through the authoritative CPU for access
> to those resources.

That's what some literature calls an "Associated Processor"
configuration, in direct contrast to a "Symmetric MultiProcessor", and
indeed in contrast to a multiprocessor (homogeneous or heterogeneous) in
which there are one or more main processors as well as one or more
specialised I/O processors.

As far as I remember, and as far as I can tell from my readings now,
"symmetry" in reference to tightly-coupled homogeneous parallel
processing systems has *always* referred to the ability of every
processor to do any I/O, as implied by the ability of the operating
system itself to run on any and all processors.  Although I can't find a
direct reference defining the term "SMP", I have only ever heard it in
reference to systems built with homogeneous microprocessors where I/O
handling was done by the main processors too (i.e. without
architecturally separate I/O processors).

Oddly, the massive and perhaps most complete single reference on the
subject (and maybe the most authoritative too), "Computer Architecture:
Concepts and Evolution" by G. A. Blaauw and F. P. Brooks, Jr. (1261 pp,
AW, 1997), doesn't even mention the phrase "symmetric multiprocessor"
(at least so far as I can find), even though it was most certainly in
use in the industry a long long time before that.  However they do say
first homogeneous multiprocessor system in which the "supervisory task"
(i.e. the operating system) itself runs on the next available processor
was probably the Burroughs D825, built in 1960.  (They only ever use
"symmetry" to refer to instruction sets.)

Traditionally mainframes have almost always had separate I/O processors,
though some like the CDC 6600 (one of Seymour Cray's early creations
from 1964) had 10 memory-coupled peripheral processors that did all I/O
and device control functions, and which also ran the operating system
itself.

So although "symmetric multiprocessing" seems to imply some direct and
specifically tailored involvement of the operating system, it also
implies many things about the hardware architecture.  I would guess that
for any multiprocessor hardware that NetBSD can run on, there are really
only two choices for an implementation that makes use of all (main)
processors:  1) master with slave(s); and 2) symmetric.  I.e. a
symmetric multiprocessing implementation of NetBSD implies only that the
kernel (including device drivers) can run on any processor.  In other
words it is irrelevant whether there's one big lock for all kernel data
or many -- the result will always be one that can be described as "SMP".
The difference, as I believe has already been mentioned several times,
is simply one of quality and efficiency.

Indeed in an AP configuration where I/O duties might be split amongst
separate processors, one begins to end up with a system that differs
from a "loosely coupled" system only in the fact that RAM is shared,
and indeed if inter-processor communications are sufficiently fast
(e.g. over a word-wide or even multi-word-wide bus or cross-bar switch),
then it is probably advantageous to give each processor separate RAM
and treat the resulting whole as a true loosely-coupled cluster.

Of course IBM's apparent success at building massive parallel processing
machines would seem to indicate that a combination of these approaches
is ideal; i.e. build a scalable cluster of tightly-coupled SMP machines.

BTW, according to Blaauw & Brooks it is _incorrect_ to refer to the
processor contention in accessing shared memory as the "von Neumann
bottleneck."

Oh, and they also mention an alternate meaning for the phrase "fine
granularity", i.e. on the hardware architecture side, where each
fine-grained processor is capable of as little as a single operation!

-- 
							Greg A. Woods

+1 416 218-0098      VE3TCP      <gwoods@acm.org>     <woods@robohack.ca>
Planix, Inc. <woods@planix.com>;   Secrets of the Weird <woods@weird.com>