Subject: Re: New kinetic figures
To: None <thorpej@zembu.com>
From: Neil A. Carson <neil@causality.com>
List: port-arm32
Date: 02/08/2001 22:40:06
> You don't need a separate call.  A separate call is silly.
> 
> You can optimize your traversal of the tables in pmap_remove(), and
> if you're clever (like the i386 pmap is, esp. in the MP branch kernel),
> you can defer all the TLB operations until the end (the i386 just does
> a full non-global TLB flush if there are > 16 TLB invalidations pending).

A couple of years ago on the ARM port I made a similar optimisation,
deferring the cache flushes to the end in order to do the non-global
flush if there were <n pending. The problem is still the multiple calls
to pmap_remove(), however: one for each mapped segment in the pmap,
AFAIR. So the single call is still more optimal, since it happens only
once and doesn't involve any page table traversal at all. (Traversal
could still displace chunks of a low-associativity cache, even if the
lower levels are not traversed, when the process is very large, which
I'll admit it probably won't be on an ARM32.) The cleverness mainly
helped for small stack bits, AFAIR. But I like even better your/Richard's
idea of getting rid of it entirely, so work on that instead :-)
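
For anyone following along, the deferral scheme described in the quoted
text can be sketched roughly as below. This is only an illustration in
plain C, not the actual i386 or arm32 pmap code: every name in it
(tlb_batch, tlb_batch_add, tlb_batch_commit, remove_range) is made up,
the two flush routines are counting stubs rather than real hardware
operations, and the threshold of 16 is simply the i386 figure quoted
above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Threshold taken from the i386 figure quoted above; an assumption here. */
#define TLB_BATCH_MAX 16

/* Counting stubs standing in for the real flush operations (hypothetical). */
static unsigned tlb_single_flushes;
static unsigned tlb_full_flushes;

static void tlb_flush_page(uintptr_t va) { (void)va; tlb_single_flushes++; }
static void tlb_flush_all_nonglobal(void) { tlb_full_flushes++; }

struct tlb_batch {
    uintptr_t va[TLB_BATCH_MAX]; /* pending per-page invalidations */
    size_t    count;             /* number of entries used in va[] */
    int       overflow;          /* set once more than TLB_BATCH_MAX are pending */
};

static void tlb_batch_init(struct tlb_batch *b)
{
    b->count = 0;
    b->overflow = 0;
}

/* Record a pending invalidation instead of issuing it immediately. */
static void tlb_batch_add(struct tlb_batch *b, uintptr_t va)
{
    if (b->overflow)
        return;              /* already committed to a full flush */
    if (b->count == TLB_BATCH_MAX) {
        b->overflow = 1;     /* too many to be worth doing one by one */
        return;
    }
    b->va[b->count++] = va;
}

/* Issue all the deferred work once, at the end of the removal. */
static void tlb_batch_commit(struct tlb_batch *b)
{
    assert(b->count <= TLB_BATCH_MAX);
    if (b->overflow) {
        tlb_flush_all_nonglobal();
    } else {
        for (size_t i = 0; i < b->count; i++)
            tlb_flush_page(b->va[i]);
    }
    tlb_batch_init(b);
}

/* Hypothetical stand-in for a pmap_remove()-style loop over [sva, eva). */
static void remove_range(uintptr_t sva, uintptr_t eva, uintptr_t pagesz)
{
    struct tlb_batch b;

    tlb_batch_init(&b);
    for (uintptr_t va = sva; va < eva; va += pagesz) {
        /* ... the page table entry for va would be cleared here ... */
        tlb_batch_add(&b, va);
    }
    tlb_batch_commit(&b);
}
```

With this sketch, removing 16 pages or fewer issues one per-page flush
each, and removing 17 or more collapses into a single non-global flush.
Note that it only addresses the cost of the flushes within one call; it
does nothing about the number of calls, which is the problem described
above.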

On the other hand, I've not looked at this stuff for so long now that I
can't remember the half of it anyhow.