Subject: Re: New kinetic figures
To: None <thorpej@zembu.com>
From: Richard Earnshaw <rearnsha@buzzard.freeserve.co.uk>
List: port-arm32
Date: 02/10/2001 13:52:31
> So, this will make the unmapping of the entire address space happen
> while the process is not curproc.  In fact, the unmap in the non-curproc
> path was already happening, but there was a redundant unmap in exit1().
> 
> It made things ever so slightly faster on my 700MHz P-III -- I wasn't
> expecting to see much improvement on that system :-)  Anyway, please
> try it on your ARM systems and tell me what improvement you see.
> 

When configuring GNU Make, this cuts the number of full cache flushes in pmap_remove by 70% and the number of partial flushes by 50%.  Together with the other changes I have made to the ARM pmap, it finally moves sa110_cache_purgeID off the number one spot in my profile.  Top of the list is now bcopy_page, followed closely by uvm_fault.

In terms of overall performance, before I started hacking the ARM pmap, a profiled run of GNU Make's configure script took 3m9s wall-clock on an otherwise idle system.  The same job now takes 2m45s (roughly 13% less), and the profiles show that we have:

1) Halved the total number of cache flushes.
2) Halved the number of calls to raisespl/splx.
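
As a quick check on those figures, here is a minimal Python sketch that redoes the arithmetic.  The numbers are copied from this message (the wall-clock times above and the call counts in the profiles below); the variable names are only illustrative, and nothing here is measured independently.

```python
# Wall-clock times for the profiled configure run, as quoted in this message.
before_s = 3 * 60 + 9     # 3m9s
after_s = 2 * 60 + 45     # 2m45s
saving = (before_s - after_s) / before_s
print(f"wall-clock: {before_s}s -> {after_s}s ({saving:.1%} less)")

# Call counts (before, after) taken from the two profile listings.
# raisespl is omitted: it does not appear in the "after" listing.
calls = {
    "sa110_cache_purgeID": (24025, 12728),
    "splx": (2139083, 1066604),
}

# Ratio of calls after the changes to calls before; about 0.5 means "halved".
ratios = {name: after / before for name, (before, after) in calls.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f} of the original call count")
```

Both ratios come out close to one half, which is consistent with points 1 and 2.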

So I think your change is definitely a good move.

Richard.

Top ten routines prior to changes (excluding mcount)

  %   cumulative   self              self     total
 time   seconds   seconds    calls  us/call  us/call  name
 15.00     21.82    21.82                             _mcount
  6.91     31.87    10.05    24025   418.31   418.31  sa110_cache_purgeID
  4.64     38.62     6.75    35528   189.99   189.99  bcopy_page
  4.39     45.01     6.39   123484    51.75   353.57  uvm_fault
  3.79     50.52     5.51  2139083     2.58     2.58  splx
  3.49     55.60     5.08  2118976     2.40     2.40  raisespl
  2.98     59.93     4.33   539910     8.02     9.12  pmap_vac_me_harder
  2.83     64.05     4.12    41802    98.56    98.56  bzero_page
  2.66     67.92     3.87   196475    19.70   197.54  data_abort_handler
  2.49     71.55     3.63                             mcount
  2.47     75.14     3.59   719901     4.99     4.99  lockmgr
  2.45     78.70     3.56   253677    14.03    24.30  pmap_enter_pv

Top ten routines after all changes (excluding mcount)

  %   cumulative   self              self     total
 time   seconds   seconds    calls  us/call  us/call  name
 14.56     18.85    18.85                             _mcount
  5.82     26.38     7.53    37590   200.32   200.32  bcopy_page
  4.46     32.15     5.77   130557    44.20   260.67  uvm_fault
  4.35     37.78     5.63    12728   442.33   442.33  sa110_cache_purgeID
  3.26     42.00     4.22  1066604     3.96     3.96  splx
  3.21     46.15     4.15   202325    20.51   167.71  data_abort_handler
  3.17     50.25     4.10    39890   102.78   102.78  bzero_page
  3.01     54.14     3.89   287465    13.53    27.63  pmap_enter
  3.00     58.02     3.88   746751     5.20     5.23  lockmgr
  2.80     61.64     3.62  1850805     1.96     1.96  pmap_pte
  2.71     65.15     3.51                             mcount
  2.60     68.52     3.37   264467    12.74    12.74  pmap_enter_pv
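
For anyone wanting to compare the two listings mechanically, the following is a rough Python sketch that parses individual flat-profile rows and diffs one routine.  It assumes the usual gprof flat-profile column order (% time, cumulative seconds, self seconds, calls, self us/call, total us/call, name); FlatEntry and parse_flat_line are hypothetical names made up for this example, not part of gprof.

```python
from typing import NamedTuple, Optional


class FlatEntry(NamedTuple):
    pct_time: float
    cumulative_s: float
    self_s: float
    calls: Optional[int]   # None for rows such as _mcount that list no calls
    name: str


def parse_flat_line(line: str) -> FlatEntry:
    """Parse one row of a flat profile (assumed column order, see above)."""
    fields = line.split()
    pct, cumulative, self_s = (float(f) for f in fields[:3])
    # Rows with call counts have 7 fields; rows without have only 4.
    calls = int(fields[3]) if len(fields) == 7 else None
    return FlatEntry(pct, cumulative, self_s, calls, fields[-1])


# Rows copied verbatim from the two listings in this message.
before = parse_flat_line(
    "  6.91     31.87    10.05    24025   418.31   418.31  sa110_cache_purgeID")
after = parse_flat_line(
    "  4.35     37.78     5.63    12728   442.33   442.33  sa110_cache_purgeID")
mcount = parse_flat_line(
    " 15.00     21.82    21.82                             _mcount")

saved_s = before.self_s - after.self_s
print(f"{before.name}: {before.calls} -> {after.calls} calls, "
      f"{saved_s:.2f}s less self time")
```

The parser splits on whitespace only, so it would need adjusting for routine names that contain spaces.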