Subject: Re: arm32 pmap changes
To: None <port-arm32@netbsd.org>
From: Chris Gilbert <chris@paradox.demon.co.uk>
List: port-arm32
Date: 06/25/2001 00:29:33
Just an update on where I'm up to.

On Friday 22 June 2001 12:47 am, Chris Gilbert wrote:
[snip]

> Over the next few days I'm planning:
> use a pool for the pmap structs, should improve performance (will benchmark
> to confirm this)

done.

> clean up pmap struct (has a couple of seemingly dead entries in it,
> pm_unused1 and pm_dref; need to verify they really are dead, though cats
> thinks they're dead)

done.

> implement pmap_map_ptes and unmap_ptes.  This is based on Richard's
> version. I plan to use it for pmap_remove initially, then expand it into
> pmap_enter and vac_me_harder.

done.

I still need to make pmap_enter use map_ptes (and that'll cut its time down 
a bit as well).

Profiling now shows fewer calls to pmap_pte, and somehow pmap_map_ptes is 
actually faster per call than pmap_pte as well.  (Note that remrunqueue is 
actually including the idle loop as well, hence the large amount of time in 
it ;)
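
As a quick sanity check on the per-call claim, the us/call column can be 
recomputed from the self seconds and call counts in the flat profile at the 
end of this message.  This is only a sketch; us_per_call and speedup are 
helper names made up for illustration, not anything in the kernel or gprof.

```c
#include <assert.h>

/* Average self time per call in microseconds, the way gprof's
 * "self us/call" column is derived (hypothetical helper). */
double us_per_call(double self_seconds, unsigned long calls)
{
    assert(calls > 0);
    return self_seconds * 1e6 / (double)calls;
}

/* Ratio of the per-call self cost of function a to that of function b
 * (hypothetical helper). */
double speedup(double a_self, unsigned long a_calls,
               double b_self, unsigned long b_calls)
{
    return us_per_call(a_self, a_calls) / us_per_call(b_self, b_calls);
}
```

Feeding in pmap_pte (1.27 s over 768200 calls) and pmap_map_ptes (0.70 s 
over 1640675 calls) gives about 1.65 and 0.43 us/call, so pmap_map_ptes comes 
out a bit under four times cheaper per call in self time.  That says nothing 
about what the callers do afterwards, so treat it as a rough guide only.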

Currently it looks like I should look into reducing the number of calls to 
splx; we call splvm a hell of a lot in the pmap.  I might look at the locking 
in the i386 version to see if we can replace the splvm's with it.  
Another major gain would be to sort out pmap_release so it doesn't have to 
walk the whole of the L1 table looking for items to free off (we should do 
that in pmap_remove).
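
One way the pmap_release idea could work, as a minimal userland sketch: keep 
a count of valid entries per L2 table and free the table from the remove path 
when the count hits zero, so release never has to scan all 4096 L1 slots.  
This is not the arm32 pmap code; toy_pmap, l2_table, toy_enter and toy_remove 
are all made-up names, and the real page table layout and allocation are more 
involved than this.

```c
#include <assert.h>
#include <stdlib.h>

#define L1_ENTRIES 4096   /* ARM L1 table: one slot per 1MB section */
#define L2_ENTRIES 256    /* coarse L2 table: one slot per 4KB page */

/* Hypothetical L2 table with a count of non-zero entries. */
struct l2_table {
    unsigned int pte[L2_ENTRIES];
    unsigned int valid;
};

/* Hypothetical pmap: just an L1 array of L2 pointers. */
struct toy_pmap {
    struct l2_table *l1[L1_ENTRIES];
    unsigned int l2_count;            /* L2 tables currently allocated */
};

/* Enter a mapping, allocating the L2 table on first use.
 * Returns 0 on success, -1 if the allocation fails. */
int toy_enter(struct toy_pmap *pm, unsigned int l1idx,
              unsigned int l2idx, unsigned int pte)
{
    struct l2_table *l2;

    assert(l1idx < L1_ENTRIES && l2idx < L2_ENTRIES && pte != 0);
    l2 = pm->l1[l1idx];
    if (l2 == NULL) {
        l2 = calloc(1, sizeof(*l2));
        if (l2 == NULL)
            return -1;
        pm->l1[l1idx] = l2;
        pm->l2_count++;
    }
    if (l2->pte[l2idx] == 0)
        l2->valid++;                  /* only count new mappings */
    l2->pte[l2idx] = pte;
    return 0;
}

/* Remove a mapping; free the L2 table as soon as it becomes empty. */
void toy_remove(struct toy_pmap *pm, unsigned int l1idx, unsigned int l2idx)
{
    struct l2_table *l2;

    assert(l1idx < L1_ENTRIES && l2idx < L2_ENTRIES);
    l2 = pm->l1[l1idx];
    if (l2 == NULL || l2->pte[l2idx] == 0)
        return;                       /* nothing mapped here */
    l2->pte[l2idx] = 0;
    if (--l2->valid == 0) {
        free(l2);
        pm->l1[l1idx] = NULL;
        pm->l2_count--;
    }
}
```

With this, a release routine would only need to check that l2_count is zero 
(or walk whatever is left) instead of scanning the whole L1.  The cost is an 
extra counter update on every enter and remove.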

Note that the profile is from doing a time make configure of gmake.

Another optimisation (again something Richard suggested) is that we should 
zero pages when idling, which means they can be allocated faster :)
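
Roughly what I have in mind for the idle zeroing, as a userland sketch only: 
keep two free lists, let the idle loop move pages from the "unknown contents" 
list to the "already zeroed" list one at a time, and have the allocator prefer 
the zeroed list.  None of this is UVM code; struct page, struct page_lists, 
free_page, idle_zero_one and alloc_zeroed are names invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096

/* Hypothetical page descriptor, not UVM's struct vm_page. */
struct page {
    unsigned char data[TOY_PAGE_SIZE];
    struct page *next;
};

/* Two free lists: contents unknown, and known to be all zeroes. */
struct page_lists {
    struct page *dirty;
    struct page *zeroed;
};

static void push(struct page **head, struct page *pg)
{
    pg->next = *head;
    *head = pg;
}

static struct page *pop(struct page **head)
{
    struct page *pg = *head;

    if (pg != NULL) {
        *head = pg->next;
        pg->next = NULL;
    }
    return pg;
}

/* Freed pages have unknown contents, so they go on the dirty list. */
void free_page(struct page_lists *pl, struct page *pg)
{
    push(&pl->dirty, pg);
}

/* Called from the idle loop: zero at most one page per call so the
 * loop can be left quickly.  Returns 1 if a page was zeroed. */
int idle_zero_one(struct page_lists *pl)
{
    struct page *pg = pop(&pl->dirty);

    if (pg == NULL)
        return 0;
    memset(pg->data, 0, TOY_PAGE_SIZE);
    push(&pl->zeroed, pg);
    return 1;
}

/* Allocate a zeroed page.  *fast is set to 1 if a pre-zeroed page was
 * available, 0 if the page had to be zeroed on demand. */
struct page *alloc_zeroed(struct page_lists *pl, int *fast)
{
    struct page *pg = pop(&pl->zeroed);

    *fast = (pg != NULL);
    if (pg == NULL) {
        pg = pop(&pl->dirty);
        if (pg != NULL)
            memset(pg->data, 0, TOY_PAGE_SIZE);
    }
    return pg;
}
```

The fast path skips the memset entirely, which is where the saving on 
allocation would come from; the slow path still works if the idle loop never 
got a chance to run.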

Cheers,
Chris

Profiling now shows:
Flat profile:
 
Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  us/call  us/call  name
 27.76     32.57    32.57                             _mcount
  7.42     41.28     8.71                             mcount
  4.41     46.45     5.17   246628    20.96   106.35  uvm_fault
  3.31     50.33     3.88       14 277142.86 417857.14  remrunqueue
  3.22     54.11     3.78    42211    89.55    89.55  bcopy_page
  2.80     57.40     3.29  2331473     1.41     1.50  splx
  2.63     60.49     3.09    21616   142.95   142.95  sa110_cache_purgeID
  2.49     63.41     2.92  1086199     2.69     2.69  lockmgr
  2.42     66.25     2.84   228041    12.45    21.20  data_abort_handler
  2.03     68.63     2.38  2313370     1.03     1.12  raisespl
  1.82     70.77     2.14   459854     4.65    10.14  pmap_enter
  1.72     72.79     2.02                             SetCPSR
  1.68     74.76     1.97 34468917     0.06     0.06  cpufunc_nullop
  1.64     76.68     1.92   827958     2.32     2.32  pmap_vac_me_harder
  1.62     78.58     1.90    45560    41.70    41.70  bzero_page
  1.58     80.43     1.85   480131     3.85     3.85  uvm_pageactivate
  1.33     81.99     1.56    84942    18.37    18.37  memset
  1.30     83.52     1.53    91499    16.72    18.11  uvm_pagealloc_strat
  1.24     84.98     1.46   136958    10.66   491.69  syscall
  1.21     86.40     1.42    37614    37.75   162.21  pmap_remove
  1.08     87.67     1.27   768200     1.65     1.65  pmap_pte
  0.94     88.77     1.10   113404     9.70     9.70  copyout
  0.90     89.83     1.06   133692     7.93     7.93  sa110_cache_purgeD_rng
  0.78     90.74     0.91   311751     2.92     2.92  _memcpy
  0.76     91.63     0.89  1133114     0.79     1.22  pmap_extract
  0.74     92.50     0.87    87418     9.95     9.95  sa110_cache_purgeID_rng
  0.68     93.30     0.80    90467     8.84    26.76  uvm_pagefree
  0.67     94.09     0.79   423348     1.87     1.87  uvm_map_lookup_entry
  0.61     94.80     0.71    64469    11.01    20.61  genfs_getpages
  0.60     95.50     0.70  1640675     0.43     0.43  pmap_map_ptes
  0.53     96.12     0.62  2297041     0.27     0.28  dosoftints
  0.53     96.74     0.62     1891   327.87  1302.70  pmap_release
  0.50     97.33     0.59    67040     8.80    12.59  prefetch_abort_handler
  0.47     97.88     0.55   336049     1.64     2.76  pmap_remove_pv
  0.43     98.39     0.51   336938     1.51     2.63  pmap_enter_pv
  0.43     98.89     0.50    68225     7.33     7.33  copyoutstr
  0.43     99.39     0.50    42211    11.85   139.28  pmap_copy_page
  0.42     99.88     0.49    92611     5.29     9.02  malloc
  0.38    100.32     0.44   346933     1.27    14.54  uvm_pagelookup
  0.37    100.75     0.43   257360     1.67     2.79  pmap_modify_pv
  0.37    101.18     0.43   104720     4.11     4.12  pool_get
  0.36    101.60     0.42  4560145     0.09     0.09  irq_setmasks
  0.36    102.02     0.42    74394     5.65    13.43  cache_lookup
  0.35    102.43     0.41   141885     2.89     4.67  pmap_handled_emulation
  0.35    102.84     0.41    25434    16.12   559.63  lookup
  0.33    103.23     0.39   365387     1.07     1.07  userret
  0.31    103.59     0.36    86156     4.18    13.75  pmap_modified_emulation