Subject: Re: CVS commit: src/lib/libc/arch/i386/string
To: Perry E. Metzger <perry@piermont.com>
From: David Laight <david@l8s.co.uk>
List: current-users
Date: 02/04/2005 19:54:36
On Fri, Feb 04, 2005 at 09:20:30AM -0500, Perry E. Metzger wrote:
> 
> Note that for many calls, mem* is now being inlined by gcc using its
> built in code (unless you do -fno-builtin etc.)

Yes, last time I looked it generated the 'rep movsl', 'rep movsb'
sequence - which is definitely sub-optimal.

For short fixed size copies I've also seen sequences of 'lods; stos'.
I've not investigated the execution times of those instructions, but
they could easily be worse than using many simpler instructions.

I've also an idea that memcpy() could read the last 4 bytes of the input
buffer first, then write them to the target buffer once the block-copy
is over (as I did with memset).  This would save some branches without
really affecting the cost of the whole operation.
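
A rough C sketch of that idea, for illustration only.  The name
copy_tail_first is made up, it assumes the buffers don't overlap, and
the 4-byte memcpy() calls just stand in for unaligned 32-bit loads and
stores; it is not the code that was committed.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: copy len bytes from src to dst, handling the
 * odd trailing bytes with one overlapping 4-byte store instead of a
 * separate byte loop.  Assumes src and dst do not overlap.
 */
static void *
copy_tail_first(void *dst, const void *src, size_t len)
{
	unsigned char *d = dst;
	const unsigned char *s = src;
	uint32_t tail, w;
	size_t i;

	/* Too short for the trick: fall back to a plain byte copy. */
	if (len < 4) {
		for (i = 0; i < len; i++)
			d[i] = s[i];
		return dst;
	}

	/* Read the last 4 bytes of the input before doing anything else. */
	memcpy(&tail, s + len - 4, 4);

	/* Block copy: whole 32-bit words only, no remainder handling. */
	for (i = 0; i + 4 <= len; i += 4) {
		memcpy(&w, s + i, 4);
		memcpy(d + i, &w, 4);
	}

	/*
	 * Write the saved tail.  When len is not a multiple of 4 this
	 * overlaps the last word already copied, which is harmless
	 * because the overlapping bytes have the same values.
	 */
	memcpy(d + len - 4, &tail, 4);
	return dst;
}
```

The only length test left is the 'len < 4' check up front; the usual
'copy 0 to 3 leftover bytes' loop after the word copy goes away, at
the cost of one extra load and one extra store.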

	David

-- 
David Laight: david@l8s.co.uk