Subject: Re: optimizations [for non-debugging] amd64 kernels
To: Hubert Feyrer <hubert@feyrer.de>
From: Blair Sadewitz <blair.sadewitz@gmail.com>
List: port-amd64
Date: 09/11/2007 07:09:31
On 9/11/07, Hubert Feyrer <hubert@feyrer.de> wrote:
> On Tue, 11 Sep 2007, Hubert Feyrer wrote:
> > Now - speed, time, diskspace, ...?
>
> Doh, s/Now/How/
>

Oh, speed/time.  It's a lot faster.  My GENERIC.MP kernel was built
with -march=nocona, so I know it's not that alone.  When I get a
chance, I'll time some compile jobs.

Also, at:

http://bahar.aydogan.net/~blair/amd64-string.diff

is an enhancement for the x86_64 memcpy/bzero/bcopy functions in
common/libc.  It was written by fuyuki@hadaly.org and is a slight
modification of the latest version (see
<http://www.hadaly.org/fuyuki>) of what was originally posted in a PR
back around Jan/Feb.
I changed the size given to the cmpq instruction right below the
remark on non-temporal hints to one quarter of my CPU's cache size
(2MB / 4 = 512KB).  I'm not sure what it should be by default.  Also,
I added the #ifndef _KERNEL, as AFAIK the kernel doesn't copy such
long strings.  I've been using this for about six months now with no
ill effects as far as I can tell.
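
For anyone who would rather not read the assembly, the threshold idea
can be sketched in C roughly as follows.  This is only an
illustration, not the code from the diff: the names sketch_copy,
sketch_wants_streaming, SKETCH_CACHE_BYTES and SKETCH_NT_THRESHOLD are
made up, and the ">=" comparison, the exact placement of the _KERNEL
guard, and the use of SSE2 intrinsics are my assumptions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Assumption: a 2MB cache, with the cutoff at one quarter of it (512KB). */
#define SKETCH_CACHE_BYTES  ((size_t)2 * 1024 * 1024)
#define SKETCH_NT_THRESHOLD (SKETCH_CACHE_BYTES / 4)

/* Nonzero when a copy of len bytes should take the streaming path. */
static int sketch_wants_streaming(size_t len)
{
#ifndef _KERNEL
    return len >= SKETCH_NT_THRESHOLD;
#else
    (void)len;      /* assumed: kernel builds never take the long path */
    return 0;
#endif
}

/* Hypothetical memcpy-like routine: large copies bypass the cache. */
static void *sketch_copy(void *dst, const void *src, size_t len)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

#if defined(__SSE2__)
    if (sketch_wants_streaming(len)) {
        /* Copy a short head so the destination is 16-byte aligned. */
        size_t head = (16 - ((uintptr_t)d & 15)) & 15;
        memcpy(d, s, head);
        d += head; s += head; len -= head;

        /* Non-temporal stores: write around the cache, 16 bytes at a time. */
        while (len >= 16) {
            __m128i v = _mm_loadu_si128((const __m128i *)s);
            _mm_stream_si128((__m128i *)d, v);
            d += 16; s += 16; len -= 16;
        }
        _mm_sfence();   /* order the streaming stores before returning */
    }
#endif
    /* Small copies, the tail, and non-SSE2 builds use the ordinary path. */
    memcpy(d, s, len);
    return dst;
}
```

The point of the cutoff is that a copy much larger than the cache would
otherwise evict everything useful from it, so past some size it is
cheaper to write around the cache.  Where that size should sit by
default is exactly the open question above.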

I shared this with christos@ about 6-8 weeks ago, and he said that it
looked good to him.  I also posted it to the list, but there was no
response.  I'd appreciate it if someone who actually knows x86_64
assembly would take a look at this, and/or if others would test it,
so we can get it into the tree at some point.

Regards,

--Blair