Subject: Re: pmap.c hacking...
To: Greg Oster <oster@cs.usask.ca>
From: Andrew Doran <ad@netbsd.org>
List: port-i386
Date: 03/29/2007 12:39:18
On Wed, Mar 28, 2007 at 08:47:38PM -0600, Greg Oster wrote:

> In an effort to reduce the amount of time spent in "system" when 
> doing a pkgsrc build, I've been mucking around with pmap.c on amd64.  
> One of the changes I've made is to have one page of 
> pmap_tlb_shootdown_job's per CPU.  This eliminates the global variable 
> used to hold the "free pool" of such jobs, and also eliminates all 
> the locking contention associated with that free pool.  From the 
> testing I've done, performance for building a set of 166 packages 
> ('time make -k -j 8') goes from this (baseline):
>  
>  18925.7u 25439.2s 3:05:03.98 399.5% 0+0k 229+2870194io 3371pf+57w
> 
> to this (when just doubling the number of shootdown job slots available 
> per CPU):
> 
>  18776.5u 23900.5s 2:56:29.96 402.9% 0+0k 292+2864111io 3374pf+0w
> 
> to this (double the number of job slots, and eliminate the global 
> "free queue"):
> 
>  17941.4u 20939.2s 2:43:56.35 395.2% 0+0k 6048+2639046io 6331pf+0w

Cool!
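
For anyone following along without the diff, here is a minimal userland
sketch of the per-CPU pool idea as described above. It is not the code from
the diff: NCPU, JOBS_PER_CPU, struct job, job_get() and job_put() are
hypothetical names for illustration, and locking is left out (the caller is
assumed to hold whatever per-CPU lock already protects that CPU's queue).

```c
#include <assert.h>
#include <stddef.h>

#define NCPU         4   /* hypothetical CPU count */
#define JOBS_PER_CPU 8   /* hypothetical; the real change uses one page per CPU */

/* Hypothetical stand-in for a shootdown job. */
struct job {
	struct job *next;    /* free-list linkage */
	unsigned long va;    /* address to invalidate */
};

/* One private pool per CPU: no global free list, no global lock. */
struct cpu_jobs {
	struct job slots[JOBS_PER_CPU];
	struct job *free;
};

static struct cpu_jobs cpu_jobs[NCPU];

/* Thread every slot onto its own CPU's free list. */
static void
jobs_init(void)
{
	for (int c = 0; c < NCPU; c++) {
		cpu_jobs[c].free = NULL;
		for (int i = 0; i < JOBS_PER_CPU; i++) {
			cpu_jobs[c].slots[i].next = cpu_jobs[c].free;
			cpu_jobs[c].free = &cpu_jobs[c].slots[i];
		}
	}
}

/* Take a job from the given CPU's pool; NULL if that pool is empty. */
static struct job *
job_get(int cpu)
{
	assert(cpu >= 0 && cpu < NCPU);
	struct job *j = cpu_jobs[cpu].free;
	if (j != NULL)
		cpu_jobs[cpu].free = j->next;
	return j;
}

/* Return a job to the pool of the CPU it came from. */
static void
job_put(int cpu, struct job *j)
{
	assert(cpu >= 0 && cpu < NCPU);
	j->next = cpu_jobs[cpu].free;
	cpu_jobs[cpu].free = j;
}
```

Exhausting one CPU's pool leaves the others untouched, which is the point:
contention on one CPU's jobs no longer serializes everyone else.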

> with my most recent tweaks to pmap.c (included here).
> 
> Comments are more than welcome (yes, the diff would need to be 
> cleaned up before it could be checked in :) )  A review by someone
> more pmap/amd64-savvy would certainly be appreciated. :)  I suspect 
> similar changes could be made to the i386 pmap.c. (I just haven't made 
> time to make the changes there too and do testing)

As an aside, the amd64 pmap lacks the lazy switching that the i386 pmap
does. Lazy switching eliminates reloads of %cr3, and the TLB flushes that
come with them, when switching to kthreads or when switching between threads
within the same process. Also, IIRC the xen pmap is an old, slightly modified
version of the i386 pmap and should be merged back in with an #ifdef XEN.
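
To make the lazy switching idea concrete, here is a rough userland model of
the decision, not the actual i386 code. struct pmap, struct cpu,
kernel_pmap and pmap_switch_sketch() are hypothetical names, and a counter
stands in for the %cr3 write. It also ignores the bookkeeping a real pmap
needs when a lazily retained address space is destroyed or has mappings
invalidated.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an address space. */
struct pmap {
	unsigned long pdirpa;       /* physical address of the top-level table */
};

/* Hypothetical per-CPU state. */
struct cpu {
	struct pmap *loaded;        /* pmap currently in %cr3 on this CPU */
	unsigned long cr3_loads;    /* counts simulated %cr3 reloads */
};

static struct pmap kernel_pmap;     /* what kthreads nominally run on */

/* Reload "%cr3" only when the incoming thread really needs another pmap. */
static void
pmap_switch_sketch(struct cpu *ci, struct pmap *next)
{
	assert(ci != NULL && next != NULL);

	/*
	 * Kernel mappings are present in every address space, so a
	 * kthread can keep running on whatever pmap is already loaded.
	 */
	if (next == &kernel_pmap)
		return;

	/* Same process (or returning to it after a kthread): nothing to do. */
	if (ci->loaded == next)
		return;

	ci->loaded = next;
	ci->cr3_loads++;            /* a real pmap would write %cr3 here */
}
```

In this model a user -> kthread -> same user sequence costs no reloads at
all, where an eager pmap would pay for two.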

The amd64 pmap was written to work on i386 too. Can anyone comment on what
the original issues were that prevented sharing it? Frank?

Andrew