Subject: Re: Performance of various memcpy()'s
To: Bang Jun-Young <junyoung@mogua.com>
From: Frank van der Linden <fvdl@wasabisystems.com>
List: tech-perform
Date: 10/16/2002 00:04:58
On Wed, Oct 16, 2002 at 04:18:30AM +0900, Bang Jun-Young wrote:
> Another attached patch is i686 version of copyin(9) that makes use
> of MMX insns. It works well with intops-only programs, but doesn't
> with ones like XFree86 that uses FP ops. In this case, it would be
> helpful if NPX handling code was imported from FreeBSD (they have
> i586 optimized version of copyin/out(9)). Can anybody give me some
> comments wrt this?

Yup, there's a lot to be gained by using SSE(2) instructions; copying
in 128-bit quantities is quite a useful thing to do. It's been
on my todo list for a while.
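
For illustration, a minimal userland sketch of copying in 128-bit
quantities with SSE2 intrinsics. The name sse2_copy is hypothetical
and is not the patch under discussion; it uses unaligned loads and
stores, falls back to plain memcpy() when SSE2 is not available at
compile time, and ignores everything a kernel version would need
(FP state handling, fault recovery for user addresses).

```c
#include <stddef.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: copy len bytes, 16 at a time where possible. */
static void *
sse2_copy(void *dst, const void *src, size_t len)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

#if defined(__SSE2__)
	/* Move one 128-bit quantity per iteration (unaligned-safe). */
	while (len >= 16) {
		__m128i v = _mm_loadu_si128((const __m128i *)s);
		_mm_storeu_si128((__m128i *)d, v);
		s += 16;
		d += 16;
		len -= 16;
	}
#endif
	/* Copy the tail (or everything, if SSE2 is unavailable). */
	if (len > 0)
		memcpy(d, s, len);
	return dst;
}
```

Like memcpy(), this assumes the source and destination do not overlap.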

I've been playing with a few SSE memcpy functions myself, but
did not get around to adding the extra checks to the FP
save/restore code yet. There are some checks that need to
be done. It comes down to:

	* Don't mess up the current process' FP state, so save it if necessary.
	* Don't bother if there aren't enough bytes to copy, since you're
	  paying the price of an entire FP save if someone was using the FPU.
	* If you're going all the way, and are using memcpy with SSE in
	  the kernel too, be careful about interrupts. If an interrupt
	  comes in during the FP save path, it will mess things up. And
	  maybe you don't want to use FP in an interrupt at all; it'll
	  cause a ton of FP save/restore actions.
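
The checks above can be sketched as a small decision function. This
is only an illustration: struct copy_ctx, choose_copy_plan and the
SSE_COPY_MIN_BYTES value are all hypothetical, not existing kernel
interfaces, and a real threshold would have to be measured. For
simplicity the size check is applied whether or not the FPU is in use.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed cutoff; below this an FP save is unlikely to pay off. */
#define SSE_COPY_MIN_BYTES 512

/* Hypothetical snapshot of the state the checks depend on. */
struct copy_ctx {
	bool in_interrupt;	/* running in interrupt context */
	bool in_fp_save_path;	/* an FP save/restore is in progress */
	bool fpu_state_live;	/* current process has live FP state */
};

enum copy_plan {
	COPY_INTEGER,		/* plain integer copy, FPU untouched */
	COPY_SSE,		/* SSE copy, nothing to save first */
	COPY_SSE_AFTER_SAVE	/* save process FP state, then SSE copy */
};

static enum copy_plan
choose_copy_plan(const struct copy_ctx *ctx, size_t len)
{
	/* Never touch FP registers from an interrupt or mid-save. */
	if (ctx->in_interrupt || ctx->in_fp_save_path)
		return COPY_INTEGER;
	/* Too small to be worth a possible FP save. */
	if (len < SSE_COPY_MIN_BYTES)
		return COPY_INTEGER;
	/* Preserve the current process' FP state before clobbering it. */
	if (ctx->fpu_state_live)
		return COPY_SSE_AFTER_SAVE;
	return COPY_SSE;
}
```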

It's not overly complicated to do, but it's important to take all
scenarios into account. copyin/out is the simplest case, since
you should be in a process context when doing those.

I'll probably have some time to spend on this soon (next month).
If you're going to work on it before then, please let me review
the changes.

- Frank