Subject: Re: Enhancements to bus_space_barrier
To: Jonathan Stone <jonathan@DSG.Stanford.EDU>
From: Jason R Thorpe <thorpej@wasabisystems.com>
List: tech-kern
Date: 10/19/2001 17:34:39
On Fri, Oct 19, 2001 at 02:19:51PM -0700, Jonathan Stone wrote:

 > I also recall a design choice that certain entries in the full R x W
 > matrix simply wouldn't be used that often; and that the decision not
 > to have a full matrix reflected that, as well as the inability of
 > "common" hardware to do the full matrix.

Yes, mostly.

I think really what happened is "we looked at what the Alpha did".  The
Alpha has "memory barriers" and "write memory barriers".  They have the
following semantics:

	MB		Ensure that all pending loads and stores complete
			before any subsequent loads or stores are issued.

	WMB		Ensure that all pending stores complete before
			any subsequent stores are issued.[*]

Note that these instructions do NOT actually flush the store buffer
or accelerate the progress of memory operations; they only guarantee
a barrier.  To force a flush of the store buffer, you'd have to:

	ldiq	t0, ALPHA_K0SEG_BASE
	mb
	ldq	t0, 0(t0)

i.e. force a load after the barrier.

Anyway, the Alpha bus_space_barrier() is implemented thusly:

	if ((f & BUS_SPACE_BARRIER_READ) != 0)
		alpha_mb();
	else if ((f & BUS_SPACE_BARRIER_WRITE) != 0)
		alpha_wmb();

Therefore, it is safe to say that the existing semantics are more like:

	BUS_SPACE_BARRIER_RW_RW
 and
	BUS_SPACE_BARRIER_W_W

For most applications, indeed I would say "for the vast majority", this
is quite sufficient.
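
To make that mapping concrete, here is a minimal sketch of the same
if/else-if logic that only reports which barrier would be issued.  The
SKETCH_* names and the flag values 0x01/0x02 are assumptions made up
for illustration, not the real definitions; check your port's bus.h.

```c
/* Assumed flag values, for illustration only (not taken from bus.h). */
#define SKETCH_BARRIER_READ	0x01
#define SKETCH_BARRIER_WRITE	0x02

/* Hypothetical labels for what the Alpha code ends up issuing. */
enum sketch_barrier {
	SKETCH_NONE,	/* neither flag set: nothing is issued */
	SKETCH_WMB,	/* alpha_wmb(): stores before stores (W_W) */
	SKETCH_MB	/* alpha_mb(): loads+stores before loads+stores (RW_RW) */
};

/*
 * Mirrors the flag test in the Alpha bus_space_barrier() shown above,
 * but returns a label instead of executing an instruction.
 */
static enum sketch_barrier
sketch_alpha_barrier(int f)
{
	if ((f & SKETCH_BARRIER_READ) != 0)
		return SKETCH_MB;	/* READ, with or without WRITE */
	else if ((f & SKETCH_BARRIER_WRITE) != 0)
		return SKETCH_WMB;	/* WRITE only */
	return SKETCH_NONE;
}
```

Under those assumptions, READ alone and READ|WRITE both come out as
the full MB (RW_RW), and only WRITE alone gets the cheaper WMB (W_W).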

Editor's note: I'm not sure what happens in the case of repeated stores
to or repeated loads from the same address, i.e.:

	load		// 1
	load		// 2
	load		// 3

or

	store		// 1
	store		// 2
	store		// 3

I'm not sure what the Alpha architecture does here -- does it issue those
loads in order (and all of them)?  Similarly for the stores.  I'm pretty
sure it does, but I haven't found text to say either way in the Alpha ARM
(I'm probably just not looking in the right place).

[*] The semantics of WMB actually are:

	Ensure that all pending stores to memory-like regions
	complete before any subsequent stores to memory-like
	regions are issued.

	Ensure that all pending stores to non-memory-like regions
	complete before any subsequent stores to non-memory-like
	regions are issued.

	[The ordering of stores with respect to memory-like vs.
	non-memory-like is not specified by WMB.  But for the
	purposes of bus_space, we don't really need to make this
	distinction.  --thorpej]

 > 
 > Isn't the real issue here to document a hierarchy of barriers, which
 > define how an implementation should "round up" a requested barrier to
 > what the hardware can actually do?  (Isn't that what the current
 > discussion really illustrates?)

-- 
        -- Jason R. Thorpe <thorpej@wasabisystems.com>