Subject: Re: RFC: Change SWI number base?
To: None <Richard.Earnshaw@arm.com>
From: David Laight <David.Laight@btinternet.com>
Date: 01/09/2002 12:15:16
> > The 'pc' value when the 'bx pc' is done MUST be a multiple of 4.
> > Apparently (in spite of what the ARM ARM may have said) some cpus
> > don't ignore bit 1 of the pc when doing pc relative loads - so
> > find all your constants rotated by 16 bits :-)
> Eh? Can you cite examples?
The following has a 50% chance of success (on some cpus):
ldr r0,=0x12345 /* [pc,#nn] */
if go_32 is at address 4n+2, code_32 will be 4n+8, the 'pc' when the
'bx pc' is executed will be '.+4' or 4n+6. In 32 bit mode bits 0 and 1
of the pc are ignored when fetching instructions, so the first fetch
is the thumb 'nop' and the pad! - hopefully a nop. But the pc used
in the pc-relative load will have bit 1 set.....
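
The address arithmetic above can be sketched in C. This is only a model of
the reasoning, not code for any particular core: the function names are made
up for illustration, and it assumes that a Thumb instruction reads 'pc' as its
own address plus 4 (the '.+4' above) and that 32 bit mode instruction fetch
drops bits 0 and 1.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: value a Thumb instruction at 'addr' sees in pc. */
uint32_t thumb_pc_value(uint32_t addr)
{
    assert((addr & 1u) == 0);       /* Thumb code is halfword aligned */
    return addr + 4u;               /* the '.+4' from the text */
}

/* Hypothetical helper: address used for a 32 bit mode instruction fetch,
 * with bits 0 and 1 of the pc ignored. */
uint32_t arm_fetch_address(uint32_t pc)
{
    return pc & ~UINT32_C(3);
}

/* Non-zero if bit 1 of the pc is set, i.e. the case where a cpu that does
 * not mask bit 1 would compute a pc-relative load address 2 bytes off. */
int pc_bit1_set(uint32_t pc)
{
    return (pc & 2u) != 0;
}
```

With go_32 at 0x1002 (4n+2) this gives a pc of 0x1006, a fetch address of
0x1004 and bit 1 set; with go_32 at 0x1000 (4n) the pc is 0x1004 and bit 1 is
clear, which is where the 50% comes from.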
> > If (many) of the syscall hooks are in one file, making the final sequence:
> > swi nnn
> > bxcc lr
> > b __go_cerror
> > __go_cerror:
> > ldr r12,=__cerror
> > bx r12
> > saves a few bytes and makes each hook 16 bytes - so they fit nicely
> > into cache lines.
> But costs an extra non-predictable branch (very expensive on XScale). I'm
> not suggesting that we should look at something like this, but the cost
> has to be borne in mind.
These sorts of costs are very difficult to quantify!
However I don't think that any optimisation of the syscall fail path
will be a gain if it lengthens the success path.
> > Now work out the optimal order for the hooks, then
> > get the .balign 32 to work (it doesn't in the arm a.out build I've used).
> a.out object files on ARM only maintain the sections to 4-byte alignment,
> so will ignore attempts to force greater alignment. The linker simply
> concatenates each similar section ensuring that they start on a 4-byte
What I guessed - it made the 32-byte alignment of the code tables for the Java
byte code interpreter and the integer divide routine somewhat sub-optimal!
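
A rough C sketch of why the request gets lost, assuming (as described in the
quoted text) that each section is only placed on a 4-byte boundary. The
function names and the example offsets are invented for illustration and are
not taken from any real linker.

```c
#include <assert.h>
#include <stdint.h>

/* Round 'offset' up to a multiple of 'align' (must be a power of two). */
uint32_t align_up(uint32_t offset, uint32_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}

/* Hypothetical model: start of the next section when the preceding one
 * ends at 'prev_end' and only 4-byte alignment is maintained. */
uint32_t next_section_start(uint32_t prev_end)
{
    return align_up(prev_end, 4);
}

/* Non-zero if 'offset' is a multiple of 'align' (power of two). */
int is_aligned(uint32_t offset, uint32_t align)
{
    return (offset & (align - 1)) == 0;
}
```

If the previous section ends at offset 0x46, the next one starts at 0x48,
which is 4-byte aligned but not 32-byte aligned; honouring a 32-byte request
would have put it at 0x60. Anything inside the section that was laid out
assuming a 32-byte aligned start is shifted by the same amount.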