Subject: Re: Xscale optimisations
To: David Laight <david@l8s.co.uk>
From: Richard Earnshaw <rearnsha@arm.com>
List: port-arm
Date: 10/14/2003 15:40:51
rearnsha@arm.com said:
> Actually, the DNARD PAL comments suggest it's more complicated than
> that: AFAICT a cache line fill will take 14 clock ticks and a line
> write 12 clocks. 8 individual stores could take as many as 56
> clocks, so there would be a clear win to pre-fetching the line
> (potentially a factor 4 performance improvement).

Doh! 56 / (14 + 12) ~= 2, not 4. Still quite a potential win,
particularly for bzero.
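
For anyone wanting to check the sum, here is a minimal sketch of the arithmetic. The three cycle counts are the ones quoted above; the constant and function names are made up for illustration. The original factor of 4 happens to match 56 / 14, i.e. counting the line fill alone, but that reading is only a guess.

```python
# Cycle counts as quoted in this thread from the DNARD PAL comments.
LINE_FILL_CLOCKS = 14      # one cache line fill
LINE_WRITE_CLOCKS = 12     # one cache line write
STORES_WORST_CLOCKS = 56   # 8 individual stores, worst case


def prefetch_speedup(stores=STORES_WORST_CLOCKS,
                     fill=LINE_FILL_CLOCKS,
                     write=LINE_WRITE_CLOCKS):
    """Ratio of worst-case individual-store cost to fill-plus-write cost."""
    return stores / (fill + write)


if __name__ == "__main__":
    # 56 / (14 + 12) = 56 / 26
    print(f"{prefetch_speedup():.2f}")  # prints 2.15
```

This only compares the worst-case figures; how close a real bzero gets to that depends on how often the stores actually hit the worst case.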

R.