Subject: Re: cacheflush() proposal
To: Chris G. Demetriou <cgd@netbsd.org>
From: Eduardo E. Horvath <eeh@one-o.com>
List: tech-userlevel
Date: 12/02/1998 20:58:44

I think we're going a tad bit overboard here.

The primary stated purpose of this was to assist the implementation of
things like JIT compilers.  That's quite reasonable.

Then I see this other stuff and think "Eek! How can I implement this?"

On 2 Dec 1998, Chris G. Demetriou wrote:

> Jason Thorpe <thorpej@nas.nasa.gov> writes:
> > How about <sys/cachectl.h>, and we just be basically compatible with the
> > IRIX interface?   Maybe also implement cachectl(2)?  (make a range cacheable
> > or uncacheable)
> 
> hmm.  i wonder if there are VM system interactions with making a range
> of memory cacheable or uncacheable.

Yes.

> e.g. the sparc has to Do The Right Thing re: object alignment for
> objects shared between processes, and if it can't it "has" to mark
> them uncacheable.
> 
> Are there any interactions with that type of thing, that require the
> VM system to become involved?
> 

Bit more complicated here, at least on V9 systems.  The D$ needs to be
aligned properly, but the E$ must be enabled for all cacheable address
spaces or it will cause memory errors (ECC is implemented in the E$) and
disabled for all non-cacheable (device register) addresses.

The VM system needs to determine what type of address each page maps to
and mark the page as cacheable or not accordingly.

> 
> Is there any reason to make cachectl() anything other than a normal
> syscall(), like mprotect(), etc.

What is the purpose of a cachectl() call?  Why would a user application
need to control this?  For that matter, why would a user application need
to explicitly flush caches?

There are several higher-level operations that I could see an application
legitimately wanting to do.

	o It might want to synchronize the data and instruction caches
	  because it just finished generating some code that it will want
	  to execute.

	o It might want to force a memory barrier to make sure any data it
	  has stored is now globally visible to other processors.

	o It might want to force read synchronization while spinning on a
	  lock.

I think it would be better to think in terms of these operations rather
than explicit operations on the caches.  Flushing a cache can have quite
different results on different cache implementations, and some caches are
simply not meant to be flushed.

I would prefer something like:

	void memsync(void *start, size_t size, int what);

where what is:

	CACHESYNC

	Synchronize data and instruction caches over this region prior to
	executing newly generated code.

	STOREBAR

	Force all pending writes to stable medium.  Used to make data
	stores globally visible.

	LOADFLUSH

	Force the next read of that location to fetch from memory rather
	than any caches to make certain the CPU sees any changes to
	memory made by other devices.
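
As a rough userland sketch of how the three requests might map onto what
a compiler already offers, assuming GCC or Clang and cache-coherent
hardware: the numeric values of CACHESYNC, STOREBAR and LOADFLUSH are
invented here, and the fences act on the whole CPU rather than on the
[start, start+size) range.  A real LOADFLUSH against a non-coherent
device, or a STOREBAR that reaches a stable medium, would need kernel
help that this sketch does not attempt.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical numeric values; the proposal only names the operations. */
enum { CACHESYNC = 1, STOREBAR = 2, LOADFLUSH = 3 };

void memsync(void *start, size_t size, int what)
{
	assert(start != NULL || size == 0);
	if (size == 0)
		return;

	switch (what) {
	case CACHESYNC:
		/*
		 * GCC/Clang builtin: make instruction fetch from this
		 * range see code that was just written there as data.
		 */
		__builtin___clear_cache((char *)start, (char *)start + size);
		break;
	case STOREBAR:
		/*
		 * Assumption: a full fence is an acceptable stand-in for
		 * "make earlier stores globally visible".  It orders the
		 * stores; it does not push them to stable storage.
		 */
		atomic_thread_fence(memory_order_seq_cst);
		break;
	case LOADFLUSH:
		/*
		 * Assumption: coherent caches, so an acquire fence is
		 * enough to keep later loads from being satisfied early.
		 */
		atomic_thread_fence(memory_order_acquire);
		break;
	default:
		break;	/* unknown request: do nothing */
	}
}
```

A JIT would call memsync(buf, len, CACHESYNC) after emitting code into
buf and before jumping to it.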

For a much more complete (and much more complex) look into this subject
take a look at the _SPARC V9 Architecture Manual_ section 8.4 where it
describes the more interesting memory models, or Appendix D where they are
specified.

=========================================================================
Eduardo Horvath				eeh@one-o.com
	"I need to find a pithy new quote." -- me