port-arm: Re: Using different cache modes for r/o vs r/w pages

Subject: Re: Using different cache modes for r/o vs r/w pages
To: None <thorpej@wasabisystems.com>
From: Richard Earnshaw <rearnsha@arm.com>
List: port-arm
Date: 01/30/2002 10:55:46

> To work around some errata in current XScale CPUs, I need to use
> different cache modes for r/o vs r/w pages.  In particular, I need
> to use write-back for r/w pages and write-through for r/o pages (if
> r/o pages have a write-back cache mode in the PTE, a write-fault
> to the page doesn't always cancel the store to memory, causing
> obvious problems).

Presumably a future stepping will fix this...

> The patch essentially does:
> 
> 	* Replace PT_CACHEABLE with a pte_cache_mode[] array, indexed
> 	  by protection code (e.g. VM_PROT_READ, VM_PROT_READ|VM_PROT_WRITE,
> 	  etc.)  Replace all uses of PT_CACHEABLE with an access of the
> 	  array.

This bit seems fine.
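
For illustration only, a minimal sketch of what such a table might look
like. This is not the code from the patch: pte_cache_mode_init() and
pte_cache_bits_for() are hypothetical names, and the VM_PROT_* and
PT_B/PT_C values are written out here as assumptions so that the fragment
stands alone.

```c
#include <assert.h>
#include <stdint.h>

/* Protection codes; values assumed to match the usual UVM definitions. */
#define VM_PROT_READ	0x01
#define VM_PROT_WRITE	0x02
#define VM_PROT_EXECUTE	0x04

/* PTE cache control bits (assumed values): B = bufferable, C = cacheable. */
#define PT_B	0x04
#define PT_C	0x08

typedef uint32_t pt_entry_t;

/* One entry per protection code (all 8 combinations of R/W/X). */
static pt_entry_t pte_cache_mode[8];

/*
 * Hypothetical initialiser for the erratum workaround described above:
 * write-back (C and B set) for writable pages, write-through (C only)
 * for everything else.
 */
static void
pte_cache_mode_init(void)
{
	unsigned prot;

	for (prot = 0; prot < 8; prot++)
		pte_cache_mode[prot] =
		    (prot & VM_PROT_WRITE) ? (PT_C | PT_B) : PT_C;
}

/* Hypothetical accessor; replaces a use of the PT_CACHEABLE constant. */
static pt_entry_t
pte_cache_bits_for(unsigned prot)
{
	assert(prot < 8);
	return pte_cache_mode[prot];
}
```

A caller would then OR pte_cache_bits_for(prot) into the PTE where it
previously ORed in PT_CACHEABLE.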

> 
> 	* Replace uses of (PT_B | PT_C) when clearing cache bits from a
> 	  PTE with a new global pte_cache_bits.  This will be more
> 	  important for a future change, which will make the kernel
> 	  use the extended cache modes available on the XScale.

Aren't all remaining uses of (PT_B | PT_C) in the inverted form (i.e. ~(...))?

If so, it would save an invert operation on every use if this were 
pte_noncache_bits.
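
A small sketch of the two forms, to show that they are interchangeable
at the call site. The helper names clear_cache_v1() and clear_cache_v2()
are made up for this example, and the PT_B/PT_C values are assumed.

```c
#include <stdint.h>

#define PT_B	0x04	/* assumed value */
#define PT_C	0x08	/* assumed value */

typedef uint32_t pt_entry_t;

/* As proposed in the patch: the global holds the cache bits themselves. */
static pt_entry_t pte_cache_bits = PT_B | PT_C;

/* Alternative: the global holds the complement, computed once. */
static pt_entry_t pte_noncache_bits = ~(pt_entry_t)(PT_B | PT_C);

/* Every use has to invert the global before masking. */
static pt_entry_t
clear_cache_v1(pt_entry_t pte)
{
	return pte & ~pte_cache_bits;
}

/* The mask is used directly; no invert at the point of use. */
static pt_entry_t
clear_cache_v2(pt_entry_t pte)
{
	return pte & pte_noncache_bits;
}
```

Whichever form is chosen, any code that later changes the set of cache
bits has to keep the global consistent with it.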

> 
> 	* When changing permission on a page, clear the existing
> 	  cache bits in the PTE and set new cache bits based on the
> 	  new page protection.  This requires a Wb cache operation
> 	  on the page.  (I certainly hope it doesn't require a
> 	  WbInv.)

I can only see one place where you are doing this.  Are you sure the other 
places where bits are changing are not also changing a mapping for an 
existing page?

> 
> I would like someone to review my changes to see if I've missed anything
> really obvious.  In particular, someone who's actually hacked on the
> cache stuff inside the pmap (Richard?)

Have you looked at the modified-emulation code?  That makes pages 
read-only, so presumably you will have to mess with the cache bits when 
doing that.  Maybe also for the referenced-emulation code.

Other than that I can't think of anything you've not covered.  My major 
concern now is general and not directly related to what you are trying to 
achieve.  The pmap code is starting to grow a lot of baggage to support 
the newer processors -- ARM9, XScale, etc. -- which is of no use to older 
machines other than to make the pmap run more slowly.

I wonder if there is some way we could structure the pmap code such that 
if support for only one class of CPU were required (i.e. because I'm 
building a specific kernel for an ARM7) we could take advantage of this 
knowledge to cut out some of the crud.

NOTE: I don't want to fork the pmap code; I just wonder if there is a way 
of expressing some of it, maybe with macros, that would allow some of the 
indirections to be bypassed when we know we are building a kernel that 
will only run on one type of processor.  For example, we know that even 
GENERIC kernels for NetWinder and CATS will only contain a StrongARM.
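
One possible shape for this, purely as a sketch: hide the table lookup
behind a macro that collapses to a constant when a (hypothetical) kernel
option says every protection uses the same cache bits. The option name
PMAP_FIXED_CACHE_MODE, the macro PTE_CACHE_MODE() and the helper
pte_set_cache() are all invented for this example, and the table
contents are just the XScale workaround values discussed above.

```c
#include <stdint.h>

#define PT_B	0x04	/* assumed value */
#define PT_C	0x08	/* assumed value */

typedef uint32_t pt_entry_t;

#ifdef PMAP_FIXED_CACHE_MODE
/*
 * Single-CPU-class kernel: the cache mode does not depend on the
 * protection, so the lookup becomes a compile-time constant.
 */
#define PTE_CACHE_MODE(prot)	((void)(prot), (pt_entry_t)(PT_B | PT_C))
#else
/*
 * Kernel that must cope with several CPU classes: indexed by the
 * protection code (bit 1 = write).  Write-back for writable pages,
 * write-through otherwise.
 */
static pt_entry_t pte_cache_mode[8] = {
	PT_C,		PT_C,		PT_C | PT_B,	PT_C | PT_B,
	PT_C,		PT_C,		PT_C | PT_B,	PT_C | PT_B,
};
#define PTE_CACHE_MODE(prot)	(pte_cache_mode[(prot) & 7])
#endif

/* Callers are written once and do not care which variant is built. */
static pt_entry_t
pte_set_cache(pt_entry_t pte, unsigned prot)
{
	return (pte & ~(pt_entry_t)(PT_B | PT_C)) | PTE_CACHE_MODE(prot);
}
```

The pmap source stays in one piece; only the macro definitions differ
between the two kinds of kernel.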

R.

PS.  "diff -p" is your friend (well mine, in this case): it labels each 
hunk of the patch with the function it comes from.