Subject: Re: Endianness conversion functions
To: Christian Biere <christianbiere@gmx.de>
From: David Laight <david@l8s.co.uk>
List: tech-misc
Date: 01/19/2007 20:55:41
On Thu, Jan 18, 2007 at 11:47:47PM +0100, Christian Biere wrote:
> 
> It's a bit sad that GCC doesn't recognize the shift/or construct though because
> I think it's cleanest version - considering that memcpy() might cause a huge
> penalty.

One of the big penalties for memcpy() (probably not relevant in this case)
is when it gets converted to a 'rep movsl' followed by a 'rep movsb' for
the remaining 0-3 bytes.
On modern CPUs the setup cost for these instructions is significant,
so using one for the trailing bytes is particularly costly.
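
For reference, a minimal sketch of the two approaches being compared,
reading a 32-bit little-endian value from a possibly unaligned buffer.
The function names are made up for illustration and are not taken from
any existing header; the run-time byte-order probe is only there to keep
the example self-contained.

```c
#include <stdint.h>
#include <string.h>

/*
 * Shift/or construct: assemble the value one byte at a time.
 * Independent of host byte order and of the alignment of p.
 */
static uint32_t
peek_le32_shift(const void *p)
{
	const unsigned char *b = p;

	return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
	    ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/*
 * memcpy() variant: copy the four bytes into a local, then swap
 * if the host turns out to be big-endian.
 */
static uint32_t
peek_le32_memcpy(const void *p)
{
	const uint16_t probe = 1;
	unsigned char first;
	uint32_t v;

	memcpy(&v, p, sizeof v);
	memcpy(&first, &probe, 1);
	if (first == 0) {
		/* Big-endian host: reverse the byte order. */
		v = (v >> 24) | ((v >> 8) & 0xff00) |
		    ((v << 8) & 0xff0000) | (v << 24);
	}
	return v;
}
```

Whether the fixed-size memcpy() collapses to a single load, or ends up
as string instructions or a function call, depends on the compiler and
its options.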

Using 'repe cmpsb' for memcmp() is similarly problematic if there are
likely to be differences in the first few bytes - I don't know how big
'few' is, but I sped up the dynamic linker by replacing the inlined memcpy()
with a call to a C routine.
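
As an illustration only (this is not the dynamic linker's code, and the
name bytecmp is hypothetical), a plain C comparison loop of the kind
that avoids the string-instruction setup cost when the buffers differ
early could look like this:

```c
#include <stddef.h>

/*
 * Byte-by-byte comparison with memcmp()-style return values:
 * negative, zero, or positive according to the first differing
 * byte, compared as unsigned char.
 */
int
bytecmp(const void *a, const void *b, size_t len)
{
	const unsigned char *pa = a;
	const unsigned char *pb = b;

	for (; len != 0; pa++, pb++, len--) {
		if (*pa != *pb)
			return *pa - *pb;
	}
	return 0;
}
```

The loop exits on the first mismatch, so a difference in the first byte
or two costs only a couple of compares and a branch.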

	David

-- 
David Laight: david@l8s.co.uk