Subject: Re: Multiple page sizes and the pmap API
To: None <eeh@netbsd.org>
From: Jason R Thorpe <thorpej@wasabisystems.com>
List: tech-kern
Date: 12/06/2001 14:01:37
On Thu, Dec 06, 2001 at 09:39:00PM -0000, eeh@netbsd.org wrote:

 > Good idea, but this looks like one of the klugy-est APIs I've ever
 > seen.

Well, I chose this API for a reason... more below...

 > | 	* pmap exports an array of page sizes, like so:
 > |
 > | 		vsize_t pmap_page_sizes[];
 > | 		int pmap_page_nsizes;
 > 
 > If UVM really needs to know this stuff (and I'm not yet convinced
 > it does, at least for the device pager) it would be better to use 
 > uvm_setpagesize() rather than export a couple of global variables.

Yes, it does.  In order to use a large page, you have to have a
suitably aligned PA *AND* VA.  You need to give UVM some way of
knowing what alignment to try to get, and what alignments it can
fall back on in case it can't get the larger one (falling back to
the default case of "base page size").
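
As a rough sketch of that fallback (illustrative only: the table contents,
the stand-in typedefs, and choose_page_size() are assumptions made up for
this example, not part of the proposal), a caller might pick the largest
size for which both addresses are aligned and the range is long enough:

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the kernel types, so the sketch compiles on its own. */
typedef uintptr_t vaddr_t;
typedef uintptr_t paddr_t;
typedef size_t    vsize_t;

/*
 * Hypothetical table contents: ascending order, index 0 is the base
 * page size.  Real values would come from the pmap implementation.
 */
static const vsize_t pmap_page_sizes[] = { 4096, 65536, 1048576 };
static const int pmap_page_nsizes = 3;

/*
 * Return the index of the largest page size that can map [va, va+len)
 * to pa.  Both VA and PA must be aligned to the candidate size; if no
 * larger size fits, fall back to index 0 (the base page size).
 */
static int
choose_page_size(vaddr_t va, paddr_t pa, vsize_t len)
{
	int i;

	for (i = pmap_page_nsizes - 1; i > 0; i--) {
		vsize_t mask = pmap_page_sizes[i] - 1;

		if (((va | pa) & mask) == 0 && len >= pmap_page_sizes[i])
			return i;
	}
	return 0;
}
```

The loop walks from the largest size downward, so a range that misses the
1MB alignment can still land on 64KB before dropping to the base size.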

 > | 		pmap_enter(pmap, va, pa,
 > | 		    VM_PROT_READ|VM_PROT_WRITE|PMAP_PAGESIZE(1), 0);
 > | 	    or
 > | 		pmap_kenter_pa(va, pa,
 > | 		    VM_PROT_READ|VM_PROT_WRITE|PMAP_PAGESIZE(1));
 > 
 > Once again, encoding the page size in with the protections looks
 > really klugy.  It would be better to add a separate size parameter 
 > and let pmap calculate how large a page can be conveniently used.
 > Or simply have a separate parameter with the page size.

Okay.  The reason that I proposed putting it in the "prot" bits
("prot" really should be renamed "flags", and the current "flags"
arg to pmap_enter() should either be folded into it, or renamed
to "flags2") is to:

	(1) Avoid adding another argument.  pmap_enter() already
	    requires passing args on the stack on MIPS and ARM,
	    and it'd be nice to not make the problem any worse,
	    especially considering that we have a whole slew of
	    empty bits in one of the arguments already.

	(2) Reduce the amount of code you have to change.  Code
	    that isn't changed to know about alternate page sizes
	    continues to use the base page size.
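
To illustrate point (2), here is one way the index could ride in the spare
bits.  The shift, mask, and the PMAP_PAGESIZE_IDX() helper are assumptions
chosen for this sketch; only the PMAP_PAGESIZE(n) spelling and the
VM_PROT_* flags come from the proposal:

```c
typedef int vm_prot_t;

/* The usual protection bits occupy the low three bits. */
#define VM_PROT_READ	0x01
#define VM_PROT_WRITE	0x02
#define VM_PROT_EXECUTE	0x04

/* Hypothetical placement of the page size index in otherwise empty bits. */
#define PMAP_PAGESIZE_SHIFT	8
#define PMAP_PAGESIZE_MASK	0xf

/* Encode a page size table index into the prot/flags word. */
#define PMAP_PAGESIZE(idx) \
	(((idx) & PMAP_PAGESIZE_MASK) << PMAP_PAGESIZE_SHIFT)

/* Hypothetical helper: extract the index again inside pmap_enter(). */
#define PMAP_PAGESIZE_IDX(prot) \
	(((prot) >> PMAP_PAGESIZE_SHIFT) & PMAP_PAGESIZE_MASK)

/* Strip the index, leaving only the protection bits. */
#define PMAP_PROT_ONLY(prot) \
	((prot) & (VM_PROT_READ | VM_PROT_WRITE | VM_PROT_EXECUTE))
```

A caller that passes plain VM_PROT_READ|VM_PROT_WRITE leaves the index
field at zero, which decodes as the base page size, so unmodified code
keeps working.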

 > I'd recommend changing uvm_setpagesize() to take 2 parameters:
 > a page size and an identifier.  Those would be inserted into
 > the table, and the identifier would be passed to pmap_enter*().
 > Then pmap_enter() could use the identifier directly in the PTE
 > rather than have to calculate the bits based on the page size
 > or index into some array.

I didn't want to do away with the notion of a "base page size", i.e.
the smallest granularity that the VM system has to deal with.  I also
wanted for machdep code to be able to #define pmap_page_nsizes to a
constant to allow the compiler to optimize better if a platform only
supports one page size.  I suppose it could be done even better by
changing pmap_page_sizes[] to pmap_page_sizes() and letting that also
evaluate to a constant in that case.
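
A sketch of how that might look in a machdep header.  The
PMAP_MULTIPLE_PAGE_SIZES option, the pmap_page_size_table name, the
PAGE_SIZE value, and largest_page_size() are all hypothetical, used here
only to show the constant-folding idea:

```c
#include <stddef.h>

typedef size_t vsize_t;		/* stand-in for the kernel type */
#define PAGE_SIZE 4096		/* assumed base page size */

#ifdef PMAP_MULTIPLE_PAGE_SIZES	/* hypothetical machdep option */
extern const vsize_t pmap_page_size_table[];
extern const int pmap_page_nsizes;
#define pmap_page_sizes(i)	(pmap_page_size_table[(i)])
#else
/* Single page size: both names collapse to compile-time constants. */
#define pmap_page_nsizes	1
#define pmap_page_sizes(i)	((vsize_t)PAGE_SIZE)
#endif

/*
 * Machine-independent code is written once against the accessor.  On a
 * single-size platform the loop bound is the constant 1, so the loop
 * body is dead code the compiler can remove.
 */
static vsize_t
largest_page_size(void)
{
	vsize_t best = pmap_page_sizes(0);
	int i;

	for (i = 1; i < pmap_page_nsizes; i++)
		if (pmap_page_sizes(i) > best)
			best = pmap_page_sizes(i);
	return best;
}
```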

Now, the other reason for using a table index is to be able to
(eventually) clump PAGE_SIZE pages into larger alternate-sized
pages.  When pages are clumped into large pages, each vm_page in that
clump would have the index into the table stored in it so that you
could easily determine the size and the start of the page it's clumped
into.
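
A minimal sketch of that lookup, assuming clumps are naturally aligned and
every size is a power of two.  The vm_page_sketch structure, its field
names, and the table contents are invented for this example and are not
the real struct vm_page:

```c
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t paddr_t;	/* stand-ins for the kernel types */
typedef size_t    vsize_t;

/* Hypothetical table; index 0 is the base page size. */
static const vsize_t pmap_page_sizes[] = { 4096, 65536, 1048576 };

/* Illustrative stand-in for the per-page structure. */
struct vm_page_sketch {
	paddr_t		phys_addr;	/* physical address of this page */
	unsigned char	pgsz_idx;	/* index into pmap_page_sizes[] */
};

/* Size of the large page this base page belongs to. */
static vsize_t
clump_size(const struct vm_page_sketch *pg)
{
	return pmap_page_sizes[pg->pgsz_idx];
}

/* Physical address where the containing large page starts. */
static paddr_t
clump_start(const struct vm_page_sketch *pg)
{
	return pg->phys_addr & ~((paddr_t)clump_size(pg) - 1);
}
```

With one small index per vm_page, both the size and the start fall out of
a table lookup and a mask.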

Using an index into a table rather than some other arbitrary identifier
provides the most flexibility and the most compact storage of the
required information.

-- 
        -- Jason R. Thorpe <thorpej@wasabisystems.com>