Subject: Re: improving ssh performance on sun4m systems
To: None <port-sparc@netbsd.org>
From: Charles Shannon Hendrix <shannon@widomaker.com>
List: port-sparc
Date: 03/15/2002 20:01:01
On Sat, Mar 16, 2002 at 01:31:59AM +0300, Valeriy E. Ushakov wrote:

> That's the point you miss.  Functions that you write yourself that are
> not in the library anywhere (and those that are too) *do call* ".umul"
> &co from libc.so for multiplication &co.

I'm not missing or debating this.

I'm talking about things outside the library. Obviously .umul &co are
in the library, and any 'a*b' will end up in the library too, since it
compiles to a call to them.

> main:
>         save %sp,-104,%sp
>         sethi %hi(a),%o1
>         ld [%o1+%lo(a)],%o0
>         sethi %hi(b),%o2
>         ld [%o2+%lo(b)],%o1
>         call .umul,0		! <-- .umul comes from libc.so
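
For reference, C along these lines would be expected to produce a
sequence like the one quoted when built without -mv8. This is only a
guess at the source, not the actual test file; the quoted code is in
main, and product() is a name made up here so the multiply can be
called on its own.

```c
/* Two globals, loaded with sethi/ld as in the quoted assembly. */
unsigned a, b;

/* Without hardware multiply enabled, the '*' below is expected to
   become "call .umul"; with -mv8 it should become an inline multiply
   instruction instead. */
unsigned product(void)
{
    return a * b;
}
```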

I understand instructions routed to functions: I wrote a library in
SPARC assembly to emulate the FPU of a Gould minicomputer some years
ago for some FORTRAN code.

I also understand that the overhead of the call is very low.

All I am saying is that -mv8 does more than replace .umul &co.
Instruction ordering is the major thing, but there may be more; I
haven't checked beyond mul/div and reordering.

For the curious, I did run some tests on .umul vs mul (and .div), and
the overhead is undetectable in a tiny program until you hit millions
of loops, and I had no patience to wait that long.
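
A test of this kind could look roughly like the sketch below. It is not
the program that was actually run; mul_loop(), time_mul_loop() and the
operand values are made up for illustration. The idea is to build it
once with -mv8 and once without, and compare the timings.

```c
#include <assert.h>
#include <time.h>

/* Multiply in a loop.  The volatile operands force a real multiply on
   every pass, so the compiler cannot fold it to a constant or hoist it
   out of the loop. */
unsigned mul_loop(unsigned a, unsigned b, unsigned long iters)
{
    volatile unsigned va = a, vb = b;
    unsigned sum = 0;
    unsigned long i;

    for (i = 0; i < iters; i++)
        sum += va * vb;     /* call .umul, or an inline multiply with -mv8 */
    return sum;
}

/* Return the CPU seconds spent running mul_loop() for iters passes. */
double time_mul_loop(unsigned long iters)
{
    volatile unsigned sink;  /* keeps the result live */
    clock_t start, end;

    start = clock();
    assert(start != (clock_t)-1);   /* clock() reports failure as -1 */
    sink = mul_loop(3, 7, iters);
    end = clock();
    (void)sink;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Call time_mul_loop() from main() with an iteration count in the
millions and print the result; with a small count the difference is
likely to be lost in the resolution of clock().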

My system no longer has libraries without the -mv8 code, and I don't
know how to build code without it, so I couldn't check to see how
the non-mv8 code would have performed.

-- 
UNIX/Perl/C/Pizza__________________________________shannon@widomaker.com