Subject: pmap_{zero|copy}_page() re-engineering
To: None <port-sgimips@netbsd.org>
From: Toru Nishimura <locore32@gaea.ocn.ne.jp>
List: port-sgimips
Date: 01/09/2003 00:56:14
I propose a design change for pmap_zero_page() and
pmap_copy_page().

The fundamental issue is that the code we have now suffers from
the original R3000 cache design.  It is a physical address indexed cache,
which means there is little need to worry about which VA the processor
actually uses to run.  The R4000 made a big stride in cache design: it
features a virtual address index but a physical address tag.  The R4000
cache also operates in writeback mode, which is acutely different from
the R3000 writethru cache.
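
To make the virtual index point concrete, here is a small sketch of
how two VAs for the same physical page can land on different cache
lines.  It assumes 4KB pages and a power-of-two cache way size; the
function names are made up for illustration and are not taken from
the pmap code.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096u   /* assumption: 4KB pages */

/* How many page-sized slots ("colours") one cache way spans. */
unsigned cache_colours(unsigned way_size)
{
    /* assumption: way size is a non-zero power of two */
    assert(way_size != 0 && (way_size & (way_size - 1)) == 0);
    return way_size > PAGE_SIZE ? way_size / PAGE_SIZE : 1;
}

/* Colour of a VA: the cache index bits that lie above the page offset. */
unsigned va_colour(uintptr_t va, unsigned way_size)
{
    return (unsigned)((va / PAGE_SIZE) & (cache_colours(way_size) - 1));
}

/*
 * Two VAs mapping the same physical page index different cache lines
 * when their colours differ.  With a physically indexed cache this
 * cannot happen, because the index comes from the PA alone.
 */
int va_may_alias(uintptr_t va1, uintptr_t va2, unsigned way_size)
{
    return va_colour(va1, way_size) != va_colour(va2, way_size);
}
```

With an 8KB way there are two colours, so VAs 0x1000 and 0x2000 may
alias while 0x1000 and 0x3000 may not.  With a 4KB way there is only
one colour and the question goes away.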

Now, pmap_zero_page() should be:

1. Map the page at a certain reserved VA range in _writethru_
mode.  The page is going to be a ZFOD (zero fill on demand) store
and is expected to be used for the first time in the process lifetime.
2. Clear the reserved VA range.  I think there is no need to
invalidate the range after writing, since the cache lines will just be
left in the clean state.
3. TBIS() the range to be safe.
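
The steps above can be sketched roughly as below.  This is only a
model that runs on ordinary memory: struct zero_window and
zero_page_sketch() are hypothetical names, the "mapping" is a plain
pointer, and the writethru attribute is just a flag.  A real
implementation would enter a PTE for the reserved VA and use TBIS()
where the comments say so.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096u   /* assumption: 4KB pages */

/* Hypothetical bookkeeping for the reserved VA window. */
struct zero_window {
    uint8_t *va;      /* page currently visible through the window */
    int writethru;    /* 1 if the window is mapped in writethru mode */
};

void zero_page_sketch(struct zero_window *w, uint8_t *page)
{
    assert(w != NULL && page != NULL);

    /* 1. map the page at the reserved window in writethru mode */
    w->va = page;
    w->writethru = 1;

    /* 2. clear the page thru the window; no invalidate afterwards */
    memset(w->va, 0, PAGE_SIZE);

    /* 3. drop the mapping (stands in for TBIS() on the window) */
    w->va = NULL;
}
```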

It's somewhat gloomy that pmap_copy_page() has no way to
know the VAs of the src/dst pages.  I guess it is used when a
COW condition is about to be dissolved.  I assume here that the src page
is in the "clean" state, since it's natural to have the COW VA range
written back (sync'ed with memory) when the range is marked
COW.

1. Map the src page at a certain reserved VA range.  The range might
conflict with the original VA of the page.
2. Map the dst page at another VA range in _writethru_ mode.  The
range might conflict with the target VA of the page.
3. Copy the src content to dst thru the VA windows.
4. Invalidate (throw away) the reserved src VA range and dst VA
range.
5. TBIS() both to be safe.
** I'm concerned about small caches of less than 8KB and have no good
idea for now.
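
As a rough model of the five steps above, again on ordinary memory:
struct copy_window and copy_page_sketch() are hypothetical names, the
cache invalidation is only recorded as a flag, and the final pointer
reset stands in for TBIS().  It says nothing about the small cache
case.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096u   /* assumption: 4KB pages */

/* Hypothetical bookkeeping for one reserved VA window. */
struct copy_window {
    uint8_t *va;        /* page currently visible through the window */
    int writethru;      /* 1 if mapped in writethru mode */
    int invalidated;    /* 1 once its cache lines were thrown away */
};

void copy_page_sketch(struct copy_window *src, struct copy_window *dst,
                      uint8_t *src_page, uint8_t *dst_page)
{
    assert(src_page != NULL && dst_page != NULL && src_page != dst_page);

    /* 1. map the src page at its reserved window */
    src->va = src_page;
    src->writethru = 0;
    src->invalidated = 0;

    /* 2. map the dst page at another window, in writethru mode */
    dst->va = dst_page;
    dst->writethru = 1;
    dst->invalidated = 0;

    /* 3. copy src to dst thru the two windows */
    memcpy(dst->va, src->va, PAGE_SIZE);

    /* 4. invalidate both windows (a real kernel would issue cache ops) */
    src->invalidated = 1;
    dst->invalidated = 1;

    /* 5. drop both mappings (stands in for TBIS() on each window) */
    src->va = NULL;
    dst->va = NULL;
}
```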

I believe it will work, but feel free to criticize if my idea is
wrong.
Points that should be taken into account:
- VCE might be posted by an L2-equipped R4000/4400, since the
L2 cache line size is larger than the L1 line size and the processor
can detect which VA range an L2 cache line is "bound" to.
- There are two different designs of external cache.  R4000/4400
and R1x000 have a dedicated L2 cache path, while the R5000/RM5200
cache works in conjunction with the SysAD bus.
- The L2 cache of R5000/RM5200 has the same line size as L1.  This
means we need to worry about the content of an L2 cache line, which is
physical address indexed, being "torn apart" into distinct
L1 cache lines.

Toru Nishimura/ALKYL Technology/www.alkyltechnology.com