Subject: Re: cpu_switch (was Re: 1.5 Release documentation ...)
To: Neil A. Carson <neil@causality.com>
From: Richard Earnshaw <rearnsha@arm.com>
List: port-arm32
Date: 11/09/2000 09:40:06
[Sorry Neil, I hadn't intended this message to be a private reply]

> Multiple mappings within the same process do happen...


I didn't intend to imply that they couldn't; but if they do, then they 
will either be mapped read-only & cacheable, or they will be mapped 
read/write & uncached (pmap_vac_me_harder).  For the former, if the page is 
the source of a copy, we can again avoid the cache flush (at least in 
theory -- IIRC the pmap will sometimes mark a page read-only even though 
its cache entry is still dirty -- this caused random failures when I 
tried it :-(  ).  For the latter, since the page is already uncached, there 
is no need to flush the cache at all.
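
As a rough sketch of the reasoning above (this is not the actual pmap 
code; the enum, the function name and the may_be_dirty flag are all made 
up for illustration), the decision for the source page of a copy might 
look something like this:

```c
#include <stdbool.h>

/* Hypothetical classification of a page's mappings; these are NOT the
 * real pmap data structures, just labels for the cases discussed. */
enum map_kind {
    MAP_MULTI_RO_CACHED,    /* several mappings, read-only & cacheable */
    MAP_MULTI_RW_UNCACHED,  /* several mappings, read/write & uncached */
    MAP_OTHER               /* anything not covered in this message */
};

/* Returns true if the cache must be flushed before the page is used as
 * the source of a copy.  may_be_dirty models the case where the pmap has
 * marked a page read-only while its cache entry is still dirty. */
bool copy_source_needs_flush(enum map_kind kind, bool may_be_dirty)
{
    switch (kind) {
    case MAP_MULTI_RW_UNCACHED:
        return false;           /* never cached, so nothing to flush */
    case MAP_MULTI_RO_CACHED:
        return may_be_dirty;    /* clean in theory; flush if in doubt */
    default:
        return true;            /* assumption: be conservative */
    }
}
```

The MAP_OTHER branch is only a conservative placeholder; the cases I 
actually care about here are the two multiple-mapping ones.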

The net result is that there are only a minuscule number of cases where we 
need to flush the entire cache (and I'm writing this now on the machine 
that has the modifications applied, to prove it :-)

The overall effect when running a configure of GNU make (a useful test, 
since it is a shell script that forks a very large number of short-running 
processes) was that my changes reduced the number of full cache flushes 
from ~24000 to ~18000, and halved the number of calls to splx 
(2,000,000 -> 1,000,000).  Of the remaining cache flushes, the breakdown 
by caller (gprof output) was as follows:

                0.02    0.00      44/17880       pmap_clean_page [89]
                0.71    0.00    1719/17880       switch_exit [141]
                3.17    0.00    7675/17880       pmap_remove [15]
                3.49    0.00    8442/17880       cpu_switch <cycle 3> [32]
[17]     6.8    7.39    0.00   17880         sa110_cache_purgeID [17]

showing that pmap_remove accounts for roughly 43% of the remaining flushes 
(7675 of 17880), so finding a way to fix it could be very beneficial.
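
To put the profile in proportion, here is a small sketch that turns the 
call counts into percentages.  The counts are copied from the gprof 
output above; the struct and function names are just made up for this 
example.

```c
#include <stddef.h>
#include <stdio.h>

/* Callers of sa110_cache_purgeID and their call counts, copied from the
 * gprof output.  The struct itself is hypothetical. */
struct flush_caller {
    const char *name;
    unsigned long calls;
};

static const struct flush_caller callers[] = {
    { "pmap_clean_page",   44 },
    { "switch_exit",     1719 },
    { "pmap_remove",     7675 },
    { "cpu_switch",      8442 },
};

#define NCALLERS (sizeof(callers) / sizeof(callers[0]))

/* Sum of the per-caller counts. */
unsigned long total_flushes(void)
{
    unsigned long total = 0;
    for (size_t i = 0; i < NCALLERS; i++)
        total += callers[i].calls;
    return total;
}

/* Share of the total, as a percentage; 0 if the total is 0. */
double share_percent(unsigned long calls, unsigned long total)
{
    return total ? 100.0 * (double)calls / (double)total : 0.0;
}

/* Print one line per caller with its share of the flushes. */
void print_breakdown(void)
{
    unsigned long total = total_flushes();
    for (size_t i = 0; i < NCALLERS; i++)
        printf("%-16s %6lu  %5.1f%%\n", callers[i].name,
               callers[i].calls, share_percent(callers[i].calls, total));
}
```

By this count pmap_remove and cpu_switch together are responsible for 
about 90% of the remaining full cache flushes.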

R.