Re: [PATCH] i386 copy.S routine optimization?

To: Yann Sionneau <yann.sionneau%gmail.com@localhost>
Subject: Re: [PATCH] i386 copy.S routine optimization?
From: David Laight <david%l8s.co.uk@localhost>
Date: Sun, 16 Jun 2013 21:29:42 +0100

On Mon, Jun 10, 2013 at 10:20:25PM +0200, Yann Sionneau wrote:
> Hello,
> 
> I already talked about this with Radoslaw Kujawa on IRC, I understood 
> that it is far from trivial to say if it is good to apply the following 
> patch [0] or not due to x86 cache and pipeline subtleties.

Please inline patches in the mail, that way they are definitely in the
mail archive. It also makes them much easier to review.
Also ensure you quote the CVS revision of the main file, otherwise
the line numbers won't match.

If you want to make a measurable improvement to copystr(), don't use
lodsb or stosb, and use 32-bit reads from userspace.
I can't remember whether it is best to do misaligned reads and aligned writes
or aligned reads and misaligned writes. In any case, if you do aligned reads
you don't have to worry about faulting at the end of a page.
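
To illustrate the point about aligned reads: a 4-byte read at a 4-byte
aligned address always stays within one page, so it cannot fault on a page
that the byte-at-a-time code would not have touched. A minimal sketch,
assuming 4 KiB pages; PAGE_SIZE_ASSUMED and crosses_page() are hypothetical
names for this example, not anything in copy.S:

```c
#include <stdint.h>

/* Assumption for this sketch: 4 KiB pages, as on i386. */
#define PAGE_SIZE_ASSUMED 4096u

/*
 * Returns 1 if a 4-byte read starting at addr touches two different
 * pages, 0 if all four bytes lie in the same page.
 */
static int crosses_page(uintptr_t addr)
{
    return (addr / PAGE_SIZE_ASSUMED) != ((addr + 3) / PAGE_SIZE_ASSUMED);
}

/*
 * Counts the addresses in [0, limit) where a 4-byte read would straddle
 * a page boundary, separately for aligned and misaligned addresses.
 */
static void count_crossings(uintptr_t limit, int *aligned, int *misaligned)
{
    uintptr_t a;

    *aligned = 0;
    *misaligned = 0;
    for (a = 0; a < limit; a++) {
        if (!crosses_page(a))
            continue;
        if (a % 4 == 0)
            (*aligned)++;
        else
            (*misaligned)++;
    }
}
```

Only the last three byte offsets of a page can start a straddling read, and
none of them is a multiple of 4, so the aligned count stays at zero.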

Look at the strlen() code for quick ways of testing for a zero byte.
For amd64 the bit masking methods are definitely faster.
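
For reference, one well-known form of the bit-masking test looks like the
sketch below. It is not necessarily the exact sequence the strlen() code
uses, and has_zero_byte() is a hypothetical name chosen for this example:

```c
#include <stdint.h>

/*
 * Returns nonzero if any of the four bytes in w is 0x00.
 *
 * Subtracting 1 from each byte sets that byte's top bit if the byte was
 * zero (it wraps to 0xff) or if it was already above 0x80. The "& ~w"
 * term discards bytes whose top bit was set to begin with, so a top bit
 * only survives where a zero byte (or a borrow out of one) was present.
 */
static int has_zero_byte(uint32_t w)
{
    return ((w - 0x01010101u) & ~w & 0x80808080u) != 0;
}
```

The test says whether a zero byte is present, not which byte it is, so a
copy loop built on it would still need to locate the terminator within the
final word, for example by falling back to a byte-by-byte pass over it.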

Probably the worst part of the current code is that the 'jz' to skip
the unwanted write will be mis-predicted.

        David

-- 
David Laight: david%l8s.co.uk@localhost

References:
- [PATCH] i386 copy.S routine optimization?
  - From: Yann Sionneau
