Subject: Re: SMP/flogging a dead horse
To: Brian C. Grayson <bgrayson@marvin.ece.utexas.edu>
From: Ted Lemon <mellon@hoffman.vix.com>
List: current-users
Date: 08/30/1998 10:19:04
> Maybe I'm wrong, but don't all Modern Processors (PPC 604e,
> Pentium, PPro, PII, UltraSparc, R10000, etc.) and/or their bus
> controllers support cache coherence automatically in hardware? 
> If so, any extra work (over a uniproc config) for maintaining
> coherence would only occur when two processors are _modifying_
> the _same_ memory location, modulo cache line size (or one
> writing and the other reading).

Work is work, whether it's done in hardware or software.  And the way
cache consistency hardware works, if _any_ processor modifies memory
that more than one processor has cached, it invalidates (at least) that
part of the cache in those other processors.  For best performance, you
need to be doing processing that doesn't involve a lot of cache
invalidations.
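
To make that concrete, here is a rough sketch in C of the usual way this
bites user code: two threads that each write only their own counter, yet
still fight over one cache line because the counters sit next to each
other.  None of it comes from any real kernel.  The struct names, the
run_pair() helper and the 64-byte LINE_SIZE are assumptions made up for
illustration, and the real line size depends on the CPU.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define LINE_SIZE 64  /* assumed cache line size in bytes; varies by CPU */

/* Two counters that will normally land in the same cache line. */
struct shared_line {
    volatile long a;
    volatile long b;
};

/* The same counters, padded so that b starts LINE_SIZE bytes after a. */
struct separate_lines {
    volatile long a;
    char pad[LINE_SIZE - sizeof(long)];
    volatile long b;
};

struct job {
    volatile long *counter;
    long iterations;
};

/* Thread body: increment one counter 'iterations' times. */
static void *bump(void *arg)
{
    struct job *j = arg;
    for (long i = 0; i < j->iterations; i++)
        (*j->counter)++;  /* a store; may invalidate the line in the other CPU's cache */
    return NULL;
}

/*
 * Run two threads, each incrementing its own counter.
 * Returns the sum of both counters, or -1 if a thread could not be created.
 */
static long run_pair(volatile long *a, volatile long *b, long iterations)
{
    pthread_t t1, t2;
    struct job j1 = { a, iterations };
    struct job j2 = { b, iterations };

    assert(a != b);  /* each thread must own its counter, so there is no data race */

    if (pthread_create(&t1, NULL, bump, &j1) != 0)
        return -1;
    if (pthread_create(&t2, NULL, bump, &j2) != 0) {
        pthread_join(t1, NULL);
        return -1;
    }
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return *a + *b;
}
```

Calling run_pair() on the two fields of a struct shared_line and then on
the two fields of a struct separate_lines gives the same sum either way.
On an MP machine the padded version would typically be expected to run
faster when timed, since the threads no longer invalidate each other's
copy of the line, but how much faster depends on the hardware.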

> This would only occur in a multithreaded kernel, or a
> multithreaded or SYSVSHM app, and not in everyday traditional
> single-thread Unix workload stuff like gcc, sh, awk.  And a
> well-written _multithreaded_ app is written/optimized to
> minimize coherence traffic, among other things.  :)

A lot of what the kernel does is to manipulate user data, so in fact
this is also more of a problem than you think.  It's true that a Big
Lock MP kernel isn't going to take any cache invalidations against
kernel data, but that isn't as much help as one might hope.
Furthermore, there's no particular reason to think that an SMP kernel
would take significantly more hits - it's really a question of what
it's doing.

			       _MelloN_