Subject: Re: Performance of various memcpy()'s
To: David Laight <firstname.lastname@example.org>
From: Bang Jun-Young <email@example.com>
Date: 10/23/2002 11:37:45
On Wed, Oct 23, 2002 at 12:04:37AM +0100, David Laight wrote:
> > BTW, where's 'rep movsw'? memcpy_rep_movsl is pretty much the same as
> > libc memcpy.
> Indeed - however there is a significant saving in not using
> rep movsb to move the odd bytes.
> The setup cost for movsb is quite significant, on my athlon (IIRC)
> the cost for rep movsl is such that it is only worth using
> for moderate length copies - indeed using word copies for short
> transfers and MMX for long transfers could easily be a win over
> rep movsl.
This is really interesting. With the addition of just two lines of code
to memcpy, it's 20% faster for copies shorter than 512 bytes!
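
For illustration only, here is a rough C sketch of the idea David
describes: copy the bulk as 32-bit words, then move the 0-3 leftover
bytes with straight-line code instead of starting a second string
instruction for them. The function name is made up for this example,
and this is not the two-line libc change mentioned above; a real i386
version would be written in assembly.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical name; a sketch, not the libc routine from this thread.
 * Like memcpy(), it assumes the two buffers do not overlap. */
static void *copy_words_then_tail(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    size_t words = n / 4;

    /* Bulk part: one 32-bit word per iteration, the work that
     * rep movsl would do in the assembly version. */
    while (words-- > 0) {
        uint32_t w;
        memcpy(&w, s, sizeof w);  /* fixed-size copy, normally a plain load */
        memcpy(d, &w, sizeof w);  /* fixed-size copy, normally a plain store */
        s += 4;
        d += 4;
    }

    /* Tail: the 0-3 odd bytes, moved with individual byte copies
     * rather than a separate rep movsb. */
    switch (n & 3) {
    case 3: d[2] = s[2]; /* fall through */
    case 2: d[1] = s[1]; /* fall through */
    case 1: d[0] = s[0]; /* fall through */
    default: break;
    }
    return dst;
}
```

Whether this actually beats rep movsl followed by rep movsb depends on
the CPU and the copy length, so it needs to be measured on each machine.
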
BTW, I noticed that our i386 memcpy() in libc checks for overlap,
although the manpage says "to copy byte strings that overlap, use
Bang Jun-Young <firstname.lastname@example.org>