Subject: re: Need sparc openboot reference (fwd)
To: None <port-sparc@netbsd.org>
From: Eduardo E. Horvath <eeh@one-o.com>
List: port-sparc
Date: 01/30/1999 08:05:26
On Sat, 30 Jan 1999, Kapil Chowksey wrote:

> On Saturday, 30 January, Eduardo E. Horvath wrote:
> 
> > > I'd wonder about the performance of using ASI's....
> > 
> > Should be faster since they explicitly bypass the MMU and don't take TLB
> > miss traps.  However, the current scheme using gcc macros is not optimal
> > since gcc can't do as good a job at scheduling the instructions.  The best
> > solution would be to modify gcc so pointers could be explicitly associated
> > with ASIs.
> 
> Bypass ASI's should be better from the cache standpoint also because
> using virtual addresses pollutes an on-cpu D$ cache line. Now, we will
> be hitting the E$ cache.

No, for H/W registers the side-effect bit in the TTE must be set so it
bypasses both the D$ and the E$, so cache pollution is not an issue.

> I think the pointer approach that you mention will hurt gcc's register
> allocation because the immediate-indexed addressing modes will not be
> available, e.g. the compiler cannot generate:
> 
> 	ldxa	%l2, [%l1 + FOO_OFFSET] ASI_BYPASS
> 
> but it will have to do
> 
> 	set	FOO_OFFSET, %l3
> 	ldxa	%l2, [%l1 + %l3] ASI_BYPASS

Actually, I don't think reg+reg is allowed for ldxa.  You would have to
do:

	add	%l1, FOO_OFFSET, %l3
	ldxa	[%l3] ASI_PHYS_NOCACHE, %l2

Since most of the time you're accessing multiple registers you could also
do:

	wr	%g0, ASI_PHYS_NOCACHE, %asi
	ldxa	[%l1 + FOO_OFFSET] %asi, %l2

Which allows you to quickly follow it up with:

	stxa	%l3, [%l1 + FOO_OFFSET2] %asi
	ldxa	[%l1 + FOO_OFFSET3] %asi, %l4

And since the ASIs are fixed values, loading the %asi register can be
hoisted well in advance.

Even with the current __asm() macros the scheduling isn't too bad.  The
main problems are that the destination address must be placed into a
register rather than using register+offset addressing, and the membar #Sync
instructions I have in there to force memory access completion are rather
expensive (although it may be possible to remove them.)
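
For illustration, a minimal sketch of what such an __asm() wrapper might
look like. This is not the actual NetBSD code: the helper names are
hypothetical, the value 0x15 for ASI_PHYS_NOCACHE is an assumption that
should be checked against the CPU manual, and the non-SPARC branch is
only a stand-in (a plain volatile access that ignores the ASI) so the
sketch compiles elsewhere. It shows both costs mentioned above: the "r"
constraint forces the full address into a register, and the store is
followed by a membar #Sync.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 0x15 is the physical, non-cached bypass ASI on UltraSPARC. */
#define ASI_PHYS_NOCACHE 0x15

#if defined(__sparc_v9__) || defined(__sparc64__)
/* Hypothetical helper: 64-bit load through the bypass ASI. */
static inline uint64_t
ldxa_phys_nocache(uint64_t pa)
{
	uint64_t v;

	assert((pa & 7) == 0);	/* 64-bit accesses must be 8-byte aligned */
	/* "r" puts the whole address in a register; "n" is the immediate ASI. */
	__asm volatile("ldxa [%1] %2, %0"
	    : "=r" (v) : "r" (pa), "n" (ASI_PHYS_NOCACHE) : "memory");
	return v;
}

/* Hypothetical helper: 64-bit store, then wait for it to complete. */
static inline void
stxa_phys_nocache(uint64_t pa, uint64_t val)
{
	assert((pa & 7) == 0);
	__asm volatile("stxa %0, [%1] %2\n\tmembar #Sync"
	    : : "r" (val), "r" (pa), "n" (ASI_PHYS_NOCACHE) : "memory");
}
#else
/* Stand-in for other hosts: treat pa as an ordinary pointer, no ASI. */
static inline uint64_t
ldxa_phys_nocache(uint64_t pa)
{
	assert((pa & 7) == 0);
	return *(volatile uint64_t *)(uintptr_t)pa;
}

static inline void
stxa_phys_nocache(uint64_t pa, uint64_t val)
{
	assert((pa & 7) == 0);
	*(volatile uint64_t *)(uintptr_t)pa = val;
}
#endif
```

Because each wrapper is a separate asm statement with a "memory" clobber,
the compiler cannot fold FOO_OFFSET into the access or reorder around it,
which is the scheduling limitation described above.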

=========================================================================
Eduardo Horvath				eeh@one-o.com
	"I need to find a pithy new quote." -- me