port-mips: Re: more on cache ops... What are they actually supposed to do?

Subject: Re: more on cache ops... What are they actually supposed to do?
To: Chris G. Demetriou <cgd@sibyte.com>
From: Michael L. Hitch <mhitch@lightning.msu.montana.edu>
List: port-mips
Date: 06/19/2000 19:59:09
On 19 Jun 2000, Chris G. Demetriou wrote:

> There seems to be some confusion in the cache op code about what the
> functions are supposed to do, and how they're supposed to do them.
> (At the very least, the lack of adequate documentation for them is
> causing _me_ confusion. 8-)

  It is very confusing, and they don't really do what you might think
they do on the MIPS3 machines.

  I believe the problem stems from trying to make the same cache ops do
the same thing for MIPS1 and MIPS3, when they really can't.  I think that
in the beginning, only the MIPS1 was supported on the pmax, and the cache
ops and their usage were tailored to the physically-indexed cache used on the
R2000/R3000 processors.  When the pica support was added, that port had
its own locore.S routines with its own cache ops.  I don't think the
pica had secondary cache, so it only needed to deal with the primary
cache.  The cache ops appear to have been used the same way as they were
for the MIPS1, but "adjusted" to flush the appropriate cache lines.  When
I added the MIPS3 support for the DECstations, I merged the MIPS3 support
from the pica pmap.c into the pmax pmap.c [there was no arch/mips at that
time].  Jonathan and I merged the MIPS1 and MIPS3 locore code into a
single locore, and then I had to deal with trying to get the DECstation
MIPS3 support to work with the secondary cache, which made the cache ops
even worse.  Since then, things have changed in trying to support a number of
different variants of the MIPS processors, which I haven't really followed
all that much.

> * the scary bit of code in pmap.c (around line 2136):

  It is rather scary, isn't it?

> 
>         if (CPUISMIPS3 && last != 0) {
>                 MachFlushDCache(va, PAGE_SIZE); 
>                 if (mips_L2CachePresent)
>                         /*
>                          * mips3_MachFlushDCache() converts the address to a
>                          * KSEG0 address, and won't properly flush the Level 2
>                          * cache.  Do another flush using the physical address
>                          * to make sure the proper secondary cache lines are 
>                          * flushed.  Ugh!
>                          */
>                         MachFlushDCache(pa, PAGE_SIZE);
>         }
> 
>   this shows that _some_ assumption is very, very bogus.  In
>   particular, I'd guess that the root cause of the problem here is that
>  the use of index ops to do range flushes is ... very, very broken.

  This "works" because the virtual address passed to FlushDCache() is
converted to KSEG0 address using the index bits.  This produces a valid
address which will flush the primary cache lines.  If secondary cache is
present, the cache flush using the VA won't flush the correct secondary
cache lines, since the KSEG0 address won't map to the real physical
address.  The second cache flush is done using the physical address,
which is also converted to a KSEG0 address using the physical address as
the index.  This is a "brute-force" method, which has the nasty side effect
of flushing much more cache than is really needed.
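
  The address arithmetic described above can be sketched in plain C that
runs on any host.  This is only an illustration under stated assumptions:
the cache sizes and all helper names (index_to_kseg0, l1_index, l2_index,
kseg0_to_phys) are hypothetical and are not the kernel's routines.  It
models a virtually-indexed primary cache and a physically-indexed
secondary cache, and shows why a KSEG0 address built from the VA's index
bits lands on the right primary lines but on the wrong secondary lines.

```c
#include <assert.h>
#include <stdint.h>

#define KSEG0_BASE      0x80000000u /* start of the unmapped, cached segment */
#define KSEG0_PHYS_MASK 0x1fffffffu /* KSEG0 address -> physical address */

/* Hypothetical cache sizes, chosen only for illustration. */
#define L1_SIZE 0x4000u   /* 16 KB primary, virtually indexed */
#define L2_SIZE 0x100000u /* 1 MB secondary, physically indexed */

/* Build a KSEG0 address whose low bits select the same lines as addr. */
static uint32_t index_to_kseg0(uint32_t addr, uint32_t cache_size)
{
    /* The mask trick only works for power-of-two cache sizes. */
    assert(cache_size != 0 && (cache_size & (cache_size - 1)) == 0);
    return KSEG0_BASE | (addr & (cache_size - 1));
}

/* Primary index is taken from the address the CPU actually issues. */
static uint32_t l1_index(uint32_t vaddr)
{
    return vaddr & (L1_SIZE - 1);
}

/* Secondary index is taken from the physical address. */
static uint32_t l2_index(uint32_t paddr)
{
    return paddr & (L2_SIZE - 1);
}

/* KSEG0 is unmapped: the physical address is just the low 29 bits. */
static uint32_t kseg0_to_phys(uint32_t kaddr)
{
    return kaddr & KSEG0_PHYS_MASK;
}
```

  With a made-up mapping of VA 0x00403000 to PA 0x01a57000,
index_to_kseg0(va, L1_SIZE) gives 0x80003000.  That has the same primary
index as the VA (0x3000), but its secondary index is 0x3000 while the real
page sits at secondary index 0x57000.  Building the KSEG0 address from the
PA with the secondary size instead gives 0x80057000, which does hit the
right secondary lines.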

    The correct way is to use the "hit" cache ops so that only the desired
cache lines get flushed properly.  The problem with this is that this
requires a valid mapping to be present.  I attempted to do this once upon
a time, but did not have much success at the time.  If the VA is a user
address, it requires the proper ASID to match the VA, and that the TLB
miss handler uses the proper TLB table.  [One thing I'm unclear on is if
the cache op can take a TLB miss, or if the TLB has to be valid to start
with.]
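
  A range flush built on "hit" ops boils down to walking the range one
cache line at a time.  The sketch below shows only that walk, in portable
C.  The line size, the for_each_line helper, and the record_line stand-in
are all hypothetical; on real hardware the per-line callback would issue
the CPU's hit writeback/invalidate cache op on an address that has a valid
translation, which is exactly the part that is hard to guarantee.

```c
#include <stddef.h>
#include <stdint.h>

#define LINE_SIZE 32u /* hypothetical primary cache line size in bytes */

typedef void (*line_op_t)(uintptr_t line_addr, void *ctx);

/*
 * Call op once for every cache line that overlaps [va, va + len).
 * Returns the number of lines visited.
 */
static size_t for_each_line(uintptr_t va, size_t len, line_op_t op, void *ctx)
{
    size_t count = 0;
    uintptr_t mask = ~(uintptr_t)(LINE_SIZE - 1);
    uintptr_t start, end, addr;

    if (len == 0)
        return 0;

    start = va & mask;                        /* round down to a line */
    end = (va + len + LINE_SIZE - 1) & mask;  /* round up to a line */

    for (addr = start; addr != end; addr += LINE_SIZE) {
        op(addr, ctx);
        count++;
    }
    return count;
}

/* Stand-in for a hit op: just remembers the last line address it saw. */
static void record_line(uintptr_t line_addr, void *ctx)
{
    *(uintptr_t *)ctx = line_addr;
}
```

  Unlike the index-based flush, this touches only the lines that overlap
the requested range: a 4096-byte page at a line-aligned address visits
128 lines of 32 bytes and nothing else.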

> * code sequences (like pmap.c:983) which look like:
> 
>                 MachFlushDCache(va, len);
>                 MachFlushICache(MIPS_PHYS_TO_KSEG0(va &
>                     (mips_L1ICacheSize - 1)), len);
> 
>   (when there's no substantive difference in the descriptions of those
>   two functions' arguments).  This goes back to comments in the first point
>   above.

  I think I'm responsible for that, and I can't remember my reasoning for
it at the time.  [I really should have documented what I was trying to
accomplish at the time.]  I'll need to cogitate on this a bit, but I think
it's to make sure the desired primary and secondary cache lines get
flushed.

> * If i understand things correctly, reading See Mips Run, it looks
>   like the way the mips1 cache ops are invoked may be broken (they seem
>   to operate on virtual addresses, but See Mips Run says that R2k/R3k
>   caches are physically addressed).

  Hmm, as I recall (but it's been quite a while since I've had to think
about this), the MIPS1 cache was physically-indexed and physically tagged.
[My limited documentation is at work someplace, so I can't check to be sure
right now.]  As best I can see, the only places the cache is flushed on
the MIPS1 do use the physical address (converted to KSEG0 because it has
to be a valid address).

> * Also, again going by documentation that I have at hand, whether the
>   effective address passed to an 'index' cache op is evaluated by the
>   TLB on its way to becoming the index used to actually address the
>   cache, is actually CPU-dependent.  I.e. the apparent assumption in
>   mips3_Flush[ID]Cache that the cache is virtually indexed is ... not
>   correct.

  So there are MIPS3 (or higher) processors that don't use virtually
indexed cache?  At the time I did the DECstation support, the only
processors I knew anything about were the 4000 and 4400.

> My understanding of the current code is that it assumes that the caches are
> virtually indexed and tagged, and that index-based cache ops don't have the
> TLB applied to the EA before it's converted to an index.

  The current MIPS3 cache ops do assume the primary caches are
virtually-indexed, physically tagged and the secondary cache is physically
indexed, physically tagged.  The indexed ops convert the passed address to
a KSEG0 address using the appropriate bits of the address to index the
desired cache lines.  Using the KSEG0 address is to get a cached address
that doesn't require a valid TLB mapping.

> It seems to me that the right thing is to _only_ use the index ops for
> whole-cache operations (i.e., only FlushCache), and to use the hit ops
> for operations which are to affect only certain (virtual) addresses.
> (And, to go along with this, if my understanding of the mips1 caches
> is correct, their use needs to be cleaned up as well.)

  I'd say that's the right way to do it, but I'm not sure how you would
ensure the TLB translation is valid.  I have thought about maybe having a
cache flush routine that would take a virtual address and physical address
and use a wired TLB entry to ensure a valid TLB entry.

  I've also always thought there should be some cache invalidate operations
for those cases where you don't need the cache flushed to memory.

> Can anybody comment on this?
> 
> 
> At minimum, somebody who understands this code needs to completely and
> adequately document the existing behaviour required from the
> functions, along the lines of functional specifications.  This would
> allow people to check the existing functions for correctness, and to
> implement new ones as appropriate...

  I'm not sure if anyone else understands this better than me (and I seem
to always get confused about this every time I have to think about it), so
I suppose I must be that "somebody".

--
Michael L. Hitch			mhitch@montana.edu
Computer Consultant
Information Technology Center
Montana State University	Bozeman, MT	USA