Subject: Caching parts of podule space
To: None <port-arm32@netbsd.org>
From: Richard Earnshaw <rearnsha@arm.com>
List: port-arm32
Date: 09/09/1999 10:05:33
First, some background.
I finally got fed up with the sucky performance of my Acorn AKA-31 SCSI
card, so I've rewritten the driver to make use of the on-board
buffer memory that can be used for DMA to the SCSI bus. Instead of
the truly sucking 100K/s or thereabouts that I used to get, I now get
about 900K/s with much lower load on the machine.
However, I've hit a brick wall: the bottleneck now seems to be the time
taken to transfer the data to the buffer memory. Podule space is
mapped uncached (sounds reasonable, you say), but on a StrongARM this
means that ldm/stm transfers are not buffered or streamed, so the hardware
in effect breaks out each load/store in the instruction into a separate
bus transaction. That probably divides the throughput to the buffer
memory by at least 2 and probably 3 (I forget the details).
Ouch! Further, these cycles all run at the podule bus speed;
again I forget the numbers, but it's something like 8MHz.
So, to the question. Is there a way to map just one page of podule space
(the page where the buffer memory is mapped) as cached/bufferable? I
really think that on a StrongARM this will be a sufficient win to make
syncing the cache during such transfers a price worth paying.
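
To make the question concrete, here is a rough sketch of what "cached/bufferable" means at the page-table level. It assumes the classic ARM MMU layout for a second-level small-page descriptor, where bit 3 is C (cacheable) and bit 2 is B (bufferable). The helper name pte_set_cache_mode and the L2_* macro names are made up for illustration; they are not NetBSD pmap interfaces, and how the kernel would actually locate the descriptor for the buffer page is left open.

```c
#include <assert.h>
#include <stdint.h>

/* ARM second-level "small page" (4KB) descriptor fields. */
#define L2_TYPE_MASK  0x3u
#define L2_TYPE_SMALL 0x2u       /* bits [1:0] = 10 */
#define L2_B          (1u << 2)  /* bufferable: stores may use the write buffer */
#define L2_C          (1u << 3)  /* cacheable */

/*
 * Hypothetical helper (not a real kernel API): return a copy of `pte`
 * with its C and B bits set as requested. All other bits (page base,
 * access permissions, type) are left alone.
 *
 * A real kernel would also have to clean/invalidate the cache lines
 * for the page and flush the matching TLB entry after storing the new
 * descriptor; that part is not shown here.
 */
uint32_t pte_set_cache_mode(uint32_t pte, int cacheable, int bufferable)
{
    /* Only small-page descriptors are handled in this sketch. */
    assert((pte & L2_TYPE_MASK) == L2_TYPE_SMALL);

    pte &= ~(L2_C | L2_B);
    if (cacheable)
        pte |= L2_C;
    if (bufferable)
        pte |= L2_B;
    return pte;
}
```

With a mapping like this, the driver would still need to write back the relevant cache lines before starting a DMA out of the buffer, and invalidate them before reading data that the card has placed there.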
Richard.
I'll make the code for this available when I've tidied up a few things.