Subject: Re: Cyberstorm (and Amiga cache)
To: None <amiga@NetBSD.ORG>
From: Jeffrey William Davis <c23jwd@eng.delcoelect.com>
List: amiga
Date: 03/25/1996 10:41:37
> Another question I have to ask again is that some people have feed back to
> me for cache on Amiga and still did not address the question I asked.
> First of all I realize 68040 has built-in L1 cache of 4k (8k ?). I have
> read various magazine in which it addresses not only lastest PowerMac have
> various L2 cache of at least 256k, but also lot of venders, like Sonnet,
> sell accelerators for Mac68k with 128K or 256K L2 cache built-in. And the 
> truth is it boosts the Mac68k. So I am just curious about L2 cache on Amiga.
> I don't know if there is zero wait time on WarpEngine. Since the high end
> workstation like Sun, Dec Alpha, all have large L2 cache on board. It is
> really hard to believe Amiga do not need L2 cache.
> 
> I really love my Amiga and like to have best performance from it. However,
> I have to say it is relatively slow comparing with my 486/100 PC running
> Linux. So where are the bottlenecks? IDE, CPU speed, memory access, video
> card, maybe lack of L2 cache.

...stuff deleted...

> I have asked lot of questions. Can anybody help?
> 
> cs_yus@cs4.lamar.edu

Here is the answer about L2 cache on the Amiga, and why it is not necessary
on accelerators like the Warp Engine.

First of all, the basic reason L2 cache (or any external caching) is
necessary is to compensate for slow main memory access.  By slow I mean
that the CPU can request memory accesses at a higher rate than the
RAM system can supply/accept them.  The cache is made up of high speed RAM,
typically fast enough to meet the CPU's demand, and essentially buffers
communication between the CPU and the main memory.  If the CPU needs data
that is present in the cache or writes data that can be stored in the
cache, it happens at nearly the maximum CPU bus speed.  If the data is
not in the cache, data transfer happens at the main RAM system speed or
maybe even slower.  The cache itself writes cached data back to main RAM
when it has the time.  The speedup comes from the cache anticipating what
the CPU will need, by holding on to recently used data, and so maximizing
the chance that requested data is already in the cache (a cache 'hit').
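
To put rough numbers on the hit/miss idea above, here is a minimal sketch
of the usual weighted-average model.  The function name and the 15ns/60ns
figures are my own illustrative assumptions, not measurements from any
particular board.

```python
def average_access_ns(hit_rate, cache_ns, ram_ns):
    """Average time per access when a fraction hit_rate is served by the cache."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    # Hits cost the cache time, misses cost the main RAM time.
    return hit_rate * cache_ns + (1.0 - hit_rate) * ram_ns


# Assumed figures: a 15ns cache in front of 60ns DRAM.
for rate in (0.0, 0.5, 0.9):
    print(rate, average_access_ns(rate, 15.0, 60.0))
```

Note that if cache_ns and ram_ns are equal, the hit rate drops out of the
result entirely: the cache buys nothing.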

Now for the Amiga with a Warp Engine.  The external bus on the CPU runs
at a finite speed!  With that bus clocked at 40MHz, the DRAM subsystem is
able to process all accesses immediately without any kind of delay (zero
wait state).
An external cache would be completely useless for communicating with the
DRAM subsystem on the Warp, since there are no delays to compensate.
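
As a rough illustration of what "zero wait state" means, the sketch below
counts the extra bus clocks an access would cost for a given RAM cycle
time.  The 2-clock minimum bus cycle and the RAM timings are hypothetical
values chosen for the example; they are not Warp Engine specifications.

```python
import math


def wait_states(bus_mhz, min_clocks, ram_cycle_ns):
    """Extra bus clocks needed per access beyond the minimum bus cycle."""
    clock_ns = 1000.0 / bus_mhz
    # Whole bus clocks the RAM needs to complete one access.
    clocks_needed = math.ceil(ram_cycle_ns / clock_ns)
    return max(0, clocks_needed - min_clocks)


# Hypothetical figures: 40MHz bus (25ns per clock), 2-clock minimum cycle.
print(wait_states(40, 2, 50))  # RAM fits inside the minimum cycle
print(wait_states(40, 2, 75))  # RAM needs one extra clock
```

When the result is zero, the RAM already keeps pace with the bus and there
is no delay left for an external cache to hide.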

The only other possible use for an L2 cache on the Warp-Amiga would be for
talking to devices on the Zorro bus, and on the motherboard.  For example,
FAST RAM on the A4000 could benefit from L2 cache, but a better solution is
to simply move the memory to the Warp Engine.  Most other devices that
could benefit from a cache are non-cacheable anyway, like I/O devices,
custom chip registers, and a majority of valid CHIP RAM usage.  It would be
difficult to find something that could really benefit from an L2 cache in
this case.

The reason L2 cache is so prevalent in the PC realm (and others) is clock
doubling and high CPU clock rates, which generally exceed what the DRAM
technology of today can deliver.  For example, a 166MHz bus speed (about
6ns per clock) on a system with 3 clocks/access would require 18ns minimum
speed RAM to run at full speed.  If you allow it to support burst accesses,
you would need 12ns RAM!
With typical DRAM being around 60ns, you've got to compensate for this
speed difference somewhere!  That's where the high speed cache memory
comes into play.  It can run at 10-20ns while the rest of the system
crawls along at 60ns.
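
The arithmetic behind those figures can be sketched as below.  The
function name is mine, and the 2 clocks per burst access is an assumption
inferred from the 12ns figure rather than something stated for any
specific CPU.

```python
def required_ram_ns(bus_mhz, clocks_per_access):
    """Slowest RAM cycle time (in ns) that still keeps up with the bus."""
    clock_ns = 1000.0 / bus_mhz  # one bus clock period in nanoseconds
    return clocks_per_access * clock_ns


print(round(required_ram_ns(166, 3), 1))  # normal access: about 18ns
print(round(required_ram_ns(166, 2), 1))  # burst access (assumed 2 clocks): about 12ns
```

Against 60ns DRAM, either figure leaves a gap of several times, and that
gap is what the external cache is there to cover.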

If there were a low cost 10ns memory device, external caches would become
obsolete on single CPU systems.  And for the Warp Engine, an external cache
is completely unnecessary.  On the other hand, a Sonnet 50MHz 68040 doubler
in an otherwise stock A4000 would need some external caching to compensate
for the 25MHz (and pathetically slow) A4000 memory subsystem!

Just keep in mind that caches are not there to 'speed up' a system.  They
are there to reduce slowdowns and alleviate bottlenecks!

=======================================================================
Jeffrey W. Davis (317)451-0503   Domain: c23jwd@eng.delcoelect.com
Software Engineer                UUCP:   deaes!c23jwd
Delco Electronics Corporation    GM:     8-322-0503         Mail: CT40A
--- Never forget: 2 + 2 = 5 for extremely large values of 2.