Subject: Re: CVS commit: src/sys/netipsec
To: None <perry@piermont.com>
From: M. Warner Losh <imp@bsdimp.com>
List: source-changes
Date: 08/14/2003 23:02:15
In message: <87r83oflh9.fsf@snark.piermont.com>
"Perry E. Metzger" <perry@piermont.com> writes:
:
: David Laight <david@l8s.co.uk> writes:
: > Indeed, on a modern x86 you do not want (ever) to execute movs{b,w,l}
: > unless it is repeated AND the repeat count is considerable
: > (unless you are optimising for space).
: >
: > ISTR that something like:
: > 1: mov (%esi,%ecx,4),%eax
: >    mov %eax,(%edi,%ecx,4)
: >    dec %ecx
: >    jns 1b
: > is faster than 'rep movsl' for %ecx < (about) 16.
:
: Are you including the overhead of calling the function?
:
: In any case, it is possible to teach GCC to use different sequences
: than the ones it uses right now.
How does one teach GCC to generate optimal code for all sizes from 16
to 64k?
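
For reference, the short-copy/long-copy split being discussed can be
sketched by hand in C. This is only an illustration under stated
assumptions: copy_words and SMALL_COPY_WORDS are made-up names, and the
cutoff of 16 is the "about 16" figure recalled above, not a measurement.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical cutoff: the "about 16" figure quoted above, not measured. */
#define SMALL_COPY_WORDS 16

/* copy_words is an illustrative name, not an existing kernel routine. */
static void *
copy_words(uint32_t *dst, const uint32_t *src, size_t nwords)
{
	/* Sanity check only; this is a sketch, not kernel code. */
	assert(dst != NULL && src != NULL);

	if (nwords < SMALL_COPY_WORDS) {
		/* Short copy: plain load/store loop, one 32-bit word at a time. */
		for (size_t i = 0; i < nwords; i++)
			dst[i] = src[i];
		return dst;
	}

	/* Long copy: defer to memcpy and whatever strategy it implements. */
	return memcpy(dst, src, nwords * sizeof(*dst));
}
```

A single fixed threshold like this is exactly what does not obviously
generalise across the whole 16 to 64k range, which is the point of the
question.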
Warner