Subject: Re: VM system with ARM32 port
To: Neil A. Carson <neil@causality.com>
From: Jonathan Stone <jonathan@DSG.Stanford.EDU>
List: tech-kern
Date: 10/16/1997 20:10:04
On Thu, 16 Oct 1997 11:08:42 -0700,  "Neil A. Carson" <neil@causality.com> writes:

[Gotchas with a virtually-addressed cache and the Mach-derived VM code]

>The present VM system in NetBSD doesn't work too efficiently on the
>StrongARM: This is because of the virtual nature of the instruction and
>data caches (things not cached by physical address), and also because
>when the instruction cache is cleaned, it must be cleaned in its
>entirety.


You don't say explicitly, but I assume the ARM cache is both virtually
indexed and virtually tagged?  It's been a while, but I thought the
performance problems one can run into with virtually-indexed,
virtually-tagged caches were well-known after the Mach port to HP-PA
machines in the late 80s. (9000/835 or thereabouts.)
Virtually-indexed, physically-tagged caches (e.g., mips3) don't seem
to be nearly so bad.

One simple trick you can use is to keep a software hint (a cache)
inside the ARM-specific pmap code. The hint indicates whether the
I-cache is clean.  Set it when you flush the I-cache; clear it when
you go to userspace.  (You might also need to change it if the kernel
I-space mappings change, e.g., after loading an LKM but before
executing any of the code.)
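
A minimal sketch of that hint in C, to make the bookkeeping concrete.
Every name here (icache_clean_hint, hw_icache_flush_all,
icache_sync_if_needed, and so on) is hypothetical and not taken from
any existing pmap code, and the "hardware" flush is a stub that only
counts calls so the logic can be exercised outside a kernel:

```c
#include <stdbool.h>

/* Hypothetical hint: true while the I-cache is known to be clean. */
static bool icache_clean_hint = false;

/* Stub counter so the effect of the hint can be observed. */
static unsigned long icache_flush_count = 0;

/* Stand-in for the real whole-I-cache flush (an assumption; the
 * actual StrongARM operation is not shown here). */
static void hw_icache_flush_all(void)
{
    icache_flush_count++;
}

/* Call wherever the pmap code would otherwise flush the I-cache
 * unconditionally: flush only if the hint says it may be dirty. */
void icache_sync_if_needed(void)
{
    if (!icache_clean_hint) {
        hw_icache_flush_all();
        icache_clean_hint = true;   /* set on flush */
    }
}

/* Call on the way back to userspace: user code may now pull lines
 * into the I-cache, so it can no longer be assumed clean. */
void icache_hint_return_to_user(void)
{
    icache_clean_hint = false;
}

/* Call when kernel I-space mappings change, e.g. after loading an
 * LKM but before executing any of its code. */
void icache_hint_kernel_text_changed(void)
{
    icache_clean_hint = false;
}
```

With this in place, a run of mapping changes made during a single
trip through the kernel costs at most one full I-cache flush; the
next flush happens only after a return to userspace or a change to
kernel text has cleared the hint.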

This gets you the reduced-icache-flush wins of a multiple-page
alternative to pmap_enter(), but without actually changing any
existing pmap code, or writing glue code for other ports that don't
have the multiple-page interface.

I've tried several performance tweaks for related issues in other VM
systems. I can suggest some other ideas, but I'd need a bit more
detail on the StrongARM VM hardware.  And I imagine it's pretty likely
someone at DEC -- maybe cgd? -- has looked at more ARM-specific
tweaks.  But if you don't hear from someone else, get in touch with me
via email.