Subject: Improving the arm32 pmap
To: None <port-arm@netbsd.org, port-arm32@netbsd.org>
From: Steve Woodford <scw@wasabisystems.com>
List: port-arm32
Date: 01/13/2003 12:14:17
Hi,

I'm sure this is something which a number of people have toyed with
for quite some time now, with varying degrees of success.

As it happens, Wasabi have allocated a bunch of my time and given me the
resources to have a good crack at optimising the arm32 pmap, whether
through a wholesale re-write or just bending the existing code into shape.

I know some of you have already whacked on the pmap a bit, and Chris
Gilbert has already given me some of his experimental changes to look
over.

So, I'm soliciting opinions on what's wrong with the existing pmap, and
suggestions on either how to bend existing code into shape, or how to do
things differently/better with new code.

For example:

  Problem:
  - L1 page tables are "per process", and their allocation requires 16KB
    contig/aligned RAM. It's hard to allocate new L1 tables, given those
    constraints, which often leads to L1 starvation.
  - L1 page tables are also zeroed whenever they are allocated to a pmap.
    The kernel's descriptors are also copied over every time.
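
  To put numbers on that, here is a rough sketch of the L1 geometry. The
  4096-entry/16KB figures follow from the ARM MMU's 1MB-per-descriptor
  L1 format; the helper names are made up for illustration and are not
  taken from the existing pmap. The kernel base passed to
  l1_kernel_entries() is an assumption about the address space layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One 32-bit L1 descriptor maps 1MB of the 4GB virtual address space. */
#define L1_SECTION_SHIFT 20u
#define L1_ENTRIES       (1u << (32u - L1_SECTION_SHIFT))   /* 4096 */
#define L1_TABLE_SIZE    (L1_ENTRIES * sizeof(uint32_t))    /* 16KB */

/* Index of the L1 descriptor that covers virtual address va. */
static inline uint32_t
l1_index(uint32_t va)
{
	return va >> L1_SECTION_SHIFT;
}

/* An L1 table must start on a 16KB boundary (hypothetical helper). */
static inline int
l1_base_ok(uint32_t pa)
{
	return (pa & (L1_TABLE_SIZE - 1)) == 0;
}

/*
 * Number of descriptors covering [kernel_base, 4GB). Assumes the kernel
 * owns the top of the address space; these are the entries that get
 * copied into every freshly allocated L1.
 */
static inline uint32_t
l1_kernel_entries(uint32_t kernel_base)
{
	assert((kernel_base & ((1u << L1_SECTION_SHIFT) - 1)) == 0);
	return L1_ENTRIES - l1_index(kernel_base);
}
```

  With an assumed kernel base of 0xc0000000, each new L1 means zeroing
  16KB and then copying 1024 kernel descriptors.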

  Solutions:
  - Make use of the MMU's "domain" feature as a primitive form of ASID.
    This allows L1 tables to be shared between multiple pmaps by doing
    "lazy" fixup of user L1 entries at fault time.
  - Allocate a fixed, smaller, number of static L1 tables at boot time.
  - Kernel always uses domain "0", which is always enabled. The kernel's
    L1 descriptors are hardwired in all L1s at boot time.
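
  A minimal sketch of how the domain idea might look, assuming each pmap
  is handed one of domains 1-15 and the kernel keeps domain 0. The bit
  positions (domain field in L1 descriptor bits 8:5, two bits per domain
  in the Domain Access Control Register, 01 = client) come from the ARM
  architecture; every function name below is hypothetical, and the
  fixup is only one guess at what "lazy" could mean in practice.

```c
#include <assert.h>
#include <stdint.h>

#define L1_DOMAIN_SHIFT 5u
#define L1_DOMAIN_MASK  (0xfu << L1_DOMAIN_SHIFT)
#define DOMAIN_CLIENT   1u	/* access checked against PTE permissions */
#define KERNEL_DOMAIN   0u	/* always enabled, as proposed above */

/* DACR value while a pmap owning user_domain runs: all others "no access". */
static inline uint32_t
dacr_for(unsigned user_domain)
{
	assert(user_domain >= 1 && user_domain <= 15);
	return (DOMAIN_CLIENT << (2 * KERNEL_DOMAIN)) |
	    (DOMAIN_CLIENT << (2 * user_domain));
}

/* Extract the domain number from an L1 descriptor. */
static inline unsigned
l1_domain(uint32_t l1pd)
{
	return (l1pd & L1_DOMAIN_MASK) >> L1_DOMAIN_SHIFT;
}

/* Return l1pd with its domain field replaced by domain. */
static inline uint32_t
l1_set_domain(uint32_t l1pd, unsigned domain)
{
	assert(domain <= 15);
	return (l1pd & ~L1_DOMAIN_MASK) | (domain << L1_DOMAIN_SHIFT);
}

/*
 * Lazy fixup at fault time: if the shared L1 slot does not already hold
 * this pmap's descriptor, install it and report that the fault was
 * handled. Returns 0 when the slot was already correct, i.e. the fault
 * is genuine. TLB/cache maintenance is left to the caller.
 */
static inline int
l1_fixup(uint32_t *slot, uint32_t want_l1pd, unsigned domain)
{
	uint32_t fixed = l1_set_domain(want_l1pd, domain);

	if (*slot == fixed)
		return 0;
	*slot = fixed;
	return 1;
}
```

  On a context switch only the DACR would need reloading; stale user
  entries left behind by another pmap sharing the same L1 fault on
  first touch and get rewritten then.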

Of course, if someone has tried the "domain" approach before, I'm all
ears. ;-)

Cheers, Steve

-- 

Wasabi Systems Inc. - The NetBSD Company - http://www.wasabisystems.com/