Subject: Re: StrongARM performance tweaks cpufunc_asm.S
To: Richard Earnshaw <rearnsha@arm.com>
From: Chris Gilbert <chris@paradox.demon.co.uk>
List: port-arm32
Date: 03/09/2001 22:43:48
On Friday 09 March 2001  3:26 pm, Richard Earnshaw wrote:
> > On Fri, Mar 09, 2001 at 02:51:45PM +0000, Chris Gilbert wrote:
> > > On Thursday 08 March 2001 11:15 am, Richard Earnshaw wrote:
> > > > Well the SA TRM definitely says that two banks aren't necessary iff
> > > > the memory is unused for any other purpose.  (maybe this was a hack
> > > > to work around not draining the write buffers properly :-)  I've been
> > > > using this code for ~6 months in my own kernel and not seen any ill
> > > > effects from it.
> > >
> > > just out of curiosity (and cos it's a mad idea :), would it be
> > > feasible to use any constant block of 16k?  eg the first 16k of the
> > > kernel (or perhaps the 16k around some bit of code that would benefit?
> > > maybe use lr as the start of the 16k, we know we're about to run that
> > > code so why not just pre-cache it?))  just seems a waste to load in 16k
> > > of nothing if we could load 16k of code that we're likely to run.
> >
> > a) you'd have to round it to some value I'd have to look up
> > b) more important: you'd have to make sure it is NOT the address range
> > that you want to flush out of the cache in the first place.
>
> c) You'd be loading into the D$, not the I$. So it wouldn't help you very
> much :-)

There must be some useful data we could load into the D$ instead, though?
I can't think what offhand.  Any big kernel data tables?  Maybe the kernel
L1 translation table, or does that change too much, or is it something that
shouldn't be cached?  It just feels like such a waste to pull 16k across
that slow bus only to chuck it away.
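
For what it's worth, here is a rough sketch of the checks (a) and (b) above
would need before any such block could be used.  Everything in it is an
assumption on my part: the 32-byte line size stands in for the rounding
value that still needs looking up, the 16k is just the block size we've been
talking about, and the function names are made up for illustration, not
taken from cpufunc_asm.S.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed figures: 16k block, 32-byte cache lines (check the TRM). */
#define DCACHE_SIZE (16u * 1024u)
#define DCACHE_LINE 32u

/* (a) Round a candidate start address (e.g. lr) down to a line boundary. */
static uint32_t clean_block_base(uint32_t addr)
{
    /* The mask trick only works for power-of-two line sizes. */
    assert((DCACHE_LINE & (DCACHE_LINE - 1u)) == 0u);
    return addr & ~(DCACHE_LINE - 1u);
}

/*
 * (b) True if the block [base, base + DCACHE_SIZE) overlaps the range
 * [start, start + len) that we actually want flushed out of the cache.
 * 64-bit ends avoid wrap-around at the top of the address space.
 */
static bool overlaps_flush_range(uint32_t base, uint32_t start, uint32_t len)
{
    uint64_t base_end = (uint64_t)base + DCACHE_SIZE;
    uint64_t flush_end = (uint64_t)start + len;

    return len != 0u && base < flush_end && start < base_end;
}
```

A caller would round the candidate with clean_block_base() and fall back to
the usual dedicated block whenever overlaps_flush_range() says the two
ranges collide.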

Cheers,
Chris