Subject: Re: query on timing of some asm
To: Chris Gilbert <chris@paradox.demon.co.uk>
From: Richard Earnshaw <rearnsha@buzzard.freeserve.co.uk>
List: port-arm
Date: 04/10/2001 21:13:47
> Hi,
> 
> I just want to clarify something to do with load delays and conditional 
> execution.  As I understand it, an ldr instruction requires 2 cycles for the 
> value to be available or it stalls.  Also, cond exec still uses up 1 cycle 
> even if the instruction isn't actually executed, so in theory:
> 
> (code is from iomd_irq.S)
> This:
> 	ldr	r6, [r7, r9, lsl #2]	/* Get address of first handler structure */
> 	ldr	r4, Lcnt		/* Stat info A */
> 
> 	teq	r6, #0x00000000		/* Do we have a handler */
> 	moveq	r0, r8			/* IRQ requests as arg 0 */
> 	
> will stall one cycle waiting for r6 to fill?

On XScale it will stall for one cycle.  On all other ARMs to date it will 
execute without stalling (provided that both ldr instructions hit in the 
cache -- ARM10 has hit-under-miss, but it is the only one that does so 
far).

> 
> so does this mean that:
> 	ldr	r6, [r7, r9, lsl #2]	/* Get address of first handler structure */
> 	ldr	r4, Lcnt		/* Stat info A */
> 
> 	mov	r0, r8			/* IRQ requests as arg 0 */
> 	teq	r6, #0x00000000		/* Do we have a handler */
> 
> where the value of r0 only matters if r6 == NULL, it's overwritten elsewhere 
> if r6 != NULL.
> 
> would actually save 1 cycle?  Or does this turn out to depend on the 
> processor?

It depends on the processor.  Your change above will only improve things 
for XScale.
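
To make the difference concrete, here is a toy issue model of the two 
orderings.  It is only a sketch, not a real pipeline model: it assumes one 
instruction issues per cycle unless a source register is still waiting on 
an earlier load, and it takes a load-use latency of 3 cycles for an 
XScale-like core and 2 cycles for the others (the latter matching the 
figure in the question).  The function name and the instruction tuples are 
made up for illustration.

```python
# Toy in-order issue model (illustrative only; latencies are assumptions).

def count_stalls(program, load_use_latency):
    """Count stall cycles for a list of (opcode, dest, sources) tuples."""
    ready_at = {}  # register name -> first cycle its value can be read
    cycle = 0
    stalls = 0
    for opcode, dest, sources in program:
        # An instruction cannot issue before all of its sources are ready.
        earliest = max([ready_at.get(reg, 0) for reg in sources] + [cycle])
        stalls += earliest - cycle
        cycle = earliest
        if dest is not None:
            latency = load_use_latency if opcode == "ldr" else 1
            ready_at[dest] = cycle + latency
        cycle += 1
    return stalls

# Original ordering from iomd_irq.S (flags modelled as a register "cpsr").
ORIGINAL = [
    ("ldr", "r6", ["r7", "r9"]),
    ("ldr", "r4", []),               # pc-relative load, no dependency modelled
    ("teq", "cpsr", ["r6"]),
    ("moveq", "r0", ["r8", "cpsr"]),
]

# Proposed ordering: the unconditional mov fills the slot before teq.
REORDERED = [
    ("ldr", "r6", ["r7", "r9"]),
    ("ldr", "r4", []),
    ("mov", "r0", ["r8"]),
    ("teq", "cpsr", ["r6"]),
]

for name, prog in (("original", ORIGINAL), ("reordered", REORDERED)):
    for latency in (2, 3):
        print(name, "latency", latency, "stalls", count_stalls(prog, latency))
```

Under these assumptions the original ordering loses one cycle only when 
the latency is 3, and the reordered one never stalls, which is the 
XScale-only saving described above.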


Tweaking assembly files is OK, provided it doesn't obscure meaning too 
much.  Generic assembly files like these are a compromise between 
performance and clarity for all supported systems.  It's not a good idea, 
for example, to have to save more registers than necessary just to avoid 
stalls (unless in a very tight loop).  Remember that older ARMs (i.e. up to 
and including ARM7) see no benefit from these rearrangements -- and 
scheduling to avoid stalling the ARM7 write buffer is a completely 
different art.

R.