Re: [PATCH] i386 copy.S routine optimization?

To: Yann Sionneau <yann.sionneau%gmail.com@localhost>
Subject: Re: [PATCH] i386 copy.S routine optimization?
From: David Laight <david%l8s.co.uk@localhost>
Date: Sun, 16 Jun 2013 21:29:42 +0100

On Mon, Jun 10, 2013 at 10:20:25PM +0200, Yann Sionneau wrote:
> Hello,
> 
> I already talked about this with Radoslaw Kujawa on IRC, I understood 
> that it is far from trivial to say if it is good to apply the following 
> patch [0] or not due to x86 cache and pipeline subtleties.

Please inline patches in the mail, that way they are definitely in the
mail archive. It also makes them much easier to review.
Also ensure you quote the CVS revision of the main file - otherwise
the line numbers won't match.

If you want to make a measurable improvement to copystr(), don't use
lodsb or stosb, and use 32bit reads from userspace.
I can't remember whether it is best to do misaligned reads and aligned writes
(or aligned reads and misaligned writes). In any case, if you do aligned reads
you don't have to worry about faulting at the end of a page, since an aligned
32bit read never crosses a page boundary.
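
As a rough illustration of the page-boundary point (not code from copy.S),
the C sketch below checks whether a read touches more than one page. The
4096-byte page size and the helper names crosses_page() and align_down4()
are assumptions made for the example.

```c
#include <stdint.h>

/* Assumption: 4 KiB pages, as on i386 with standard paging. */
#define SKETCH_PAGE_SIZE 4096u

/* Returns 1 if the access [addr, addr + len) touches more than one page. */
static int crosses_page(uintptr_t addr, unsigned len)
{
    return (addr / SKETCH_PAGE_SIZE) != ((addr + len - 1) / SKETCH_PAGE_SIZE);
}

/* Round an address down to the containing 4-byte aligned word. */
static uintptr_t align_down4(uintptr_t addr)
{
    return addr & ~(uintptr_t)3;
}
```

A 4-byte read at 0x1ffd spans two pages, so it can fault on the second page
even if the string ends in the first. The aligned word at 0x1ffc that
contains the same byte stays within one page.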

Look at the strlen() code for quick ways of testing for a zero byte.
For amd64 the bit masking methods are definitely faster.
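
For reference, one widely known bit-masking test for a zero byte in a 32bit
word looks roughly like the C sketch below. This is the generic trick, not
a copy of the strlen() code, and the name has_zero_byte() is made up for
the example.

```c
#include <stdint.h>

/*
 * Returns 1 if any of the four bytes in w is zero, 0 otherwise.
 * Subtracting 0x01 from each byte sets the top bit of a byte that was
 * zero (via borrow); masking with ~w discards bytes whose top bit was
 * already set in the input.
 */
static int has_zero_byte(uint32_t w)
{
    return ((w - 0x01010101u) & ~w & 0x80808080u) != 0;
}
```

A word-at-a-time copy loop would read an aligned word, apply this test, and
fall back to byte handling only for the word that contains the terminator.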

Probably the worst part of the current code is that the 'jz' to skip
the unwanted write will be mis-predicted.

        David

-- 
David Laight: david%l8s.co.uk@localhost

References:
- [PATCH] i386 copy.S routine optimization?
  - From: Yann Sionneau
