Subject: Re: Port of NetBSD to XScale
To: Chris Gilbert <chris@paradox.demon.co.uk>
From: Reinoud Zandijk <imago@kabel065011.kabel.utwente.nl>
List: port-arm32
Date: 03/29/2001 16:14:40
Hiya Chris,

On Thu, 29 Mar 2001, Chris Gilbert wrote:
> Branching looks to be worse than ever at 4 cycle for a branch miss, or 0 if
> it's predicted by the branch prediction buffer, it doesn't see the standard
> MOV PC, LR to return method, I suspect that doing B LR will help it there.

Oh, but it does! Look at table 14-5: the `MOV' is a data-processing
operation :)

A `MOV pc, lr' takes 5 cycles (4+1) ... see tables 14-5 and 14-6.

> In fact looking at the timing of things this:
> LDR R14, [R2] (takes 1 issue cycle and stall 3 for the data result)
> B R14	    (takes 1 issue cycle if predicted, or 5 if not)

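This is an illegal instruction ... a `B' can only branch by a fixed
PC-relative offset encoded in the instruction itself ... not to an
address held in a register. For a register target you need something
like `MOV pc, r14' (or `BX r14').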

> could be faster than:
> LDR PC, R2  (takes minimum of 8 issue cycles)

Yep, it takes 8 cycles :(( but that is at 750 MHz ... so about a `3' in
terms of SA1 speed :(
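
To make that conversion explicit, here is a rough sketch of the arithmetic.
It assumes a 233 MHz StrongARM (the figure used in the sum further down) and
a 750 MHz XScale; the function and constant names are made up for this
illustration and come from no manual.

```python
# Assumed clock rates: 233 MHz StrongARM and 750 MHz XScale, as quoted
# in this thread. Real parts may be clocked differently.
SA_MHZ = 233.0
XSCALE_MHZ = 750.0


def cycles_to_ns(cycles, mhz):
    """Wall-clock time in nanoseconds for `cycles` clock ticks at `mhz`."""
    return cycles * 1000.0 / mhz


def equivalent_cycles(cycles, from_mhz, to_mhz):
    """How many `to_mhz` cycles last as long as `cycles` at `from_mhz`."""
    return cycles * to_mhz / from_mhz


# The 8-cycle load into the PC on the XScale, in time and in SA cycles.
ldr_pc_ns = cycles_to_ns(8, XSCALE_MHZ)
ldr_pc_sa_cycles = equivalent_cycles(8, XSCALE_MHZ, SA_MHZ)

# Ratio of the two clocks.
clock_ratio = XSCALE_MHZ / SA_MHZ

print(f"8 XScale cycles = {ldr_pc_ns:.1f} ns "
      f"= {ldr_pc_sa_cycles:.2f} SA cycles (clock ratio {clock_ratio:.2f})")
```

Under those assumptions the 8 cycles come to roughly 2.5 StrongARM cycles,
which is where the `3' above comes from after rounding up.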

> some potentially useful things look to be the ability to lock memory into the
> instruction cache, eg we could lock the 0 page vectors into the cache.

That would be quite expensive :( ... but I would suggest the task-switching
and low-level IRQ + SWI code as candidates too ...

> Of course the xscale looks to be clocked at near twice the speed of the SA,
> so it probably will seem slightly faster :)

3*233 = 699 ... and still no 750 :)

Cheers,
Reinoud