Subject: Re: Xscale optimisations
To: David Laight <firstname.lastname@example.org>
From: Richard Earnshaw <email@example.com>
Date: 10/14/2003 15:40:51
> Actually, the DNARD PAL comments suggest it's more complicated than
> that: AFAICT a cache line fill will take 14 clock ticks and a line
> write 12 clocks. 8 individual stores could take as many as 56
> clocks, so there would be a clear win to pre-fetching the line
> (potentially a factor 4 performance improvement).
Doh! 56 / (14 + 12) ~= 2.15, so a factor of about 2, not 4. Still quite
a potential win, particularly for bzero.
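
For anyone who wants to redo the sum with different timings, here is a minimal sketch of the arithmetic in C. The clock counts are just the figures quoted above from the DNARD PAL comments; the helper name speedup_x100 and the enum names are made up for illustration and are not from any kernel source.

```c
#include <assert.h>

/* Clock counts as quoted above from the DNARD PAL comments. */
enum {
    LINE_FILL_CLOCKS  = 14,  /* one cache line fill */
    LINE_WRITE_CLOCKS = 12,  /* one cache line write */
    STORES_CLOCKS     = 56   /* worst case for 8 individual stores */
};

/*
 * Hypothetical helper: ratio of the individual-store cost to the
 * fill-plus-line-write cost, scaled by 100 and truncated, so that
 * no floating point is needed.
 */
unsigned speedup_x100(unsigned stores, unsigned fill, unsigned line_write)
{
    assert(fill + line_write > 0);  /* avoid dividing by zero */
    return (stores * 100u) / (fill + line_write);
}
```

speedup_x100(STORES_CLOCKS, LINE_FILL_CLOCKS, LINE_WRITE_CLOCKS) gives 215, i.e. roughly 2.15 times, assuming the worst-case store figure applies.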