Subject: Re: CVS commit: src/sys/netipsec
To: None <perry@piermont.com>
From: M. Warner Losh <imp@bsdimp.com>
List: source-changes
Date: 08/14/2003 23:02:15
In message: <87r83oflh9.fsf@snark.piermont.com>
            "Perry E. Metzger" <perry@piermont.com> writes:
: 
: David Laight <david@l8s.co.uk> writes:
: > Indeed, on a modern x86 you do not want (ever) to execute movs{b,w,l}
: > unless it is repeated AND the repeat count is considerable
: > (unless you are optimising for space).
: > 
: > ISTR that something like:
: > 1:	mov	(%esi,%ecx,4),%eax
: > 	mov	%eax,(%edi,%ecx,4)
: > 	dec	%ecx
: > 	jns	1b
: > is faster than 'rep movsw' for %ecx < (about) 16.
: 
: Are you including the overhead of calling the function?
: 
: In any case, it is possible to teach GCC to use different sequences
: than the ones it uses right now.

How does one teach gcc to generate optimal code for all sizes from 16
to 64k?
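
To make the question concrete: when the count is not known at compile
time, the obvious approach is to pick the copy sequence at run time.
Below is a minimal sketch in C, not anything from gcc or the tree. The
helper name copy_words() and the SMALL_COPY_WORDS cutoff are
hypothetical; the value 16 is just the "about 16" figure quoted above
and would need to be measured per CPU.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical cutoff, in 32-bit words; the thread only says "about 16". */
#define SMALL_COPY_WORDS 16

/* Copy nwords 32-bit words from src to dst; the buffers must not overlap. */
static void
copy_words(uint32_t *dst, const uint32_t *src, size_t nwords)
{
	if (nwords < SMALL_COPY_WORDS) {
		/* Short copy: plain load/store loop, counting down to zero. */
		while (nwords-- > 0)
			dst[nwords] = src[nwords];
	} else {
		/* Long copy: defer to the library's bulk copy routine. */
		memcpy(dst, src, nwords * sizeof(*dst));
	}
}
```

The extra compare and branch on every call is itself a cost, which is
part of why a single sequence that is best for every size is hard to
come by.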

Warner