Subject: Re: Using different cache modes for r/o vs r/w pages
To: None <thorpej@wasabisystems.com>
From: Richard Earnshaw <rearnsha@arm.com>
List: port-arm
Date: 01/30/2002 10:55:46
> To work around some errata in current XScale CPUs, I need to use
> different cache modes for r/o vs r/w pages. In particular, I need
> to use write-back for r/w pages and write-through for r/o pages (if
> r/o pages have a write-back cache mode in the PTE, a write-fault
> to the page doesn't always cancel the store to memory, causing
> obvious problems).
Presumably a future stepping will fix this...
> The patch essentially does:
>
> * Replace PT_CACHEABLE with a pte_cache_mode[] array, indexed
> by protection code (e.g. VM_PROT_READ, VM_PROT_READ|VM_PROT_WRITE,
> etc.) Replace all uses of PT_CACHEABLE with an access of the
> array.
This bit seems fine.
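
A minimal sketch of the table-lookup idea, for reference. The bit values, the
table size, and the names pte_cache_mode_init_xscale() and
pte_cache_bits_for() are assumptions made for illustration; they are not
taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration; the real ones come from kernel headers. */
#define VM_PROT_READ   0x1
#define VM_PROT_WRITE  0x2
#define PT_B           0x04u   /* bufferable bit in a small-page PTE */
#define PT_C           0x08u   /* cacheable bit in a small-page PTE */
#define NPROT          8       /* read/write/execute combinations */

/* Cache bits to OR into a PTE, indexed by protection code. */
static uint32_t pte_cache_mode[NPROT];

/* Hypothetical init for the XScale workaround described above:
 * write-back (C|B) for writable pages, write-through (C only) otherwise. */
void pte_cache_mode_init_xscale(void)
{
    for (int prot = 0; prot < NPROT; prot++) {
        if (prot & VM_PROT_WRITE)
            pte_cache_mode[prot] = PT_C | PT_B;
        else
            pte_cache_mode[prot] = PT_C;
    }
}

/* Replacement for a bare PT_CACHEABLE: look the mode up by protection. */
uint32_t pte_cache_bits_for(int prot)
{
    assert(prot >= 0 && prot < NPROT);
    return pte_cache_mode[prot];
}
```

A CPU without the erratum could fill every entry with the same value, so the
callers would not need to change.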
>
> * Replace uses of (PT_B | PT_C) when clearing cache bits from a
> PTE with a new global pte_cache_bits. This will be more
> important for a future change, which will make the kernel
> use the extended cache modes available on the XScale.
Aren't all remaining uses of (PT_B | PT_C) in the inverted form (i.e. ~(...))?
If so, it would save an invert operation on every use if this were
pte_noncache_bits instead.
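
To make the suggestion concrete, here is a small sketch of the two forms. The
bit values and the helper names pte_clear_cache_inverting() and
pte_clear_cache_premasked() are assumptions for illustration. Whether the
second form really saves an instruction depends on the compiler and target.

```c
#include <stdint.h>

/* Assumed values for illustration only. */
#define PT_B 0x04u
#define PT_C 0x08u

/* As proposed in the patch: the set of cache bits. */
uint32_t pte_cache_bits = PT_B | PT_C;

/* As suggested here: the complement, computed once at setup time. */
uint32_t pte_noncache_bits = ~(PT_B | PT_C);

/* Clearing cache bits with pte_cache_bits needs an invert at each use. */
uint32_t pte_clear_cache_inverting(uint32_t pte)
{
    return pte & ~pte_cache_bits;
}

/* With the pre-inverted mask, each use is a single AND. */
uint32_t pte_clear_cache_premasked(uint32_t pte)
{
    return pte & pte_noncache_bits;
}
```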
>
> * When changing permission on a page, clear the existing
> cache bits in the PTE and set new cache bits based on the
> new page protection. This requires a Wb cache operation
> on the page. (I certainly hope it doesn't require a
> WbInv.)
I can only see one place where you are doing this. Are you sure the other
places where bits are changing are not also changing a mapping for an
existing page?
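
For reference, a rough sketch of the sequence described in the quoted item:
clear the old cache bits, set new ones from the new protection, and write back
the page if the mode changed. Everything here is hypothetical: the bit values,
cache_mode_for(), pte_reprotect(), and the counting stub that stands in for a
real cache write-back. Access-permission bits and TLB maintenance are left out.

```c
#include <stdint.h>

/* Assumed values for illustration only. */
#define VM_PROT_READ   0x1
#define VM_PROT_WRITE  0x2
#define PT_B           0x04u
#define PT_C           0x08u

/* Stub standing in for a real per-page cache write-back; it only counts. */
static unsigned wb_calls;
static void cache_wb_page_stub(uintptr_t va)
{
    (void)va;
    wb_calls++;
}

unsigned wb_call_count(void)
{
    return wb_calls;
}

/* Write-back for writable pages, write-through for read-only ones. */
static uint32_t cache_mode_for(int prot)
{
    return (prot & VM_PROT_WRITE) ? (PT_C | PT_B) : PT_C;
}

/* Compute the PTE for a new protection. If the cache mode changes,
 * request a write-back before the caller installs the new PTE. */
uint32_t pte_reprotect(uint32_t pte, uintptr_t va, int new_prot)
{
    uint32_t new_pte = (pte & ~(PT_B | PT_C)) | cache_mode_for(new_prot);

    if (new_pte != pte)
        cache_wb_page_stub(va);
    return new_pte;
}
```

Any other path that rewrites an existing mapping would need the same
treatment, which is the point of the question above.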
>
> I would like someone to review my changes to see if I've missed anything
> really obvious. In particular, someone who's actually hacked on the
> cache stuff inside the pmap (Richard?)
Have you looked at the modified-emulation code? That makes pages
read-only, so presumably you will have to mess with the cache bits when
doing that. Maybe also for the referenced-emulation code.
Other than that I can't think of anything you've not covered. My major
concern now is general and not directly related to what you are trying to
achieve. The pmap code is starting to grow a lot of baggage to support
the newer processors -- ARM9, XScale, etc. -- which is of no use to older
machines other than to make the pmap run more slowly.
I wonder if there is some way we could structure the pmap code such that
if support for only one class of CPU were required (i.e. because I'm
building a specific kernel for an ARM7) we could take advantage of this
knowledge to cut out some of the crud.
NOTE: I don't want to fork the pmap code, I just wonder if there is a way
of expressing some of it, maybe with macros, that would allow some of the
indirections to be bypassed when we know we are building a kernel that
will only run on one type of processor. For example, we know that even
GENERIC kernels for NetWinder and CATS will only contain a StrongARM.
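
Something along these lines is what I have in mind, as a sketch only. The
option name PMAP_SINGLE_CPU_STRONGARM, the macro PTE_CACHE_MODE(), and
make_cacheable_pte() are invented for illustration, and the bit values are
assumed.

```c
#include <stdint.h>

/* Assumed values for illustration only. */
#define PT_B 0x04u
#define PT_C 0x08u

#ifdef PMAP_SINGLE_CPU_STRONGARM
/* Single CPU class: the mode is a compile-time constant, no table load. */
#define PTE_CACHE_MODE(prot) ((void)(prot), (uint32_t)(PT_B | PT_C))
#else
/* Multi-CPU kernel: look the mode up by protection code (bit 1 = write). */
static uint32_t pte_cache_mode[8] = {
    PT_C,        PT_C,        PT_C | PT_B, PT_C | PT_B,
    PT_C,        PT_C,        PT_C | PT_B, PT_C | PT_B,
};
#define PTE_CACHE_MODE(prot) (pte_cache_mode[(prot) & 7])
#endif

/* Callers are written once and do not care which variant is built. */
uint32_t make_cacheable_pte(uint32_t pa, int prot)
{
    return pa | PTE_CACHE_MODE(prot);
}
```

The pmap source stays common; only the macro definition changes with the
kernel configuration.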
R.
PS. "diff -p" is your friend (well, mine, in this case): it labels each
hunk of the patch with the function it comes from.