Subject: Re: EGCS enabled on mips
To: Todd Whitesel <toddpw@best.com>
From: Jonathan Stone <jonathan@DSG.Stanford.EDU>
List: tech-toolchain
Date: 10/31/1998 23:11:09
>What if there was an alternative?
>Add a third reloc called "hiadj" or "hi1" which looks at the low 16 bits of
>the expression, and compensates for them. For example (pardon my asm, it's
>been a while):
>
> lui $1, %hi(expression)
> ori $1, $1, %lo(expression)
>
> lui $2, %hiadj(expression)
> ld $2, $2, %lo(expression)
But %hiadj is what the current %hi reloc means: the HI16 and LO16
relocs are defined such that the low-order offset is signed.
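To make the carry compensation concrete, here is a minimal Python sketch of the arithmetic. It is not taken from any assembler or linker; the helper names (sign_extend16, split_hi_lo, lui_addiu) are made up for illustration, and it assumes 32-bit addresses.

```python
MASK32 = 0xFFFFFFFF

def sign_extend16(x):
    """Interpret the low 16 bits of x as a signed value, as addiu does."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x

def split_hi_lo(addr):
    """Split addr into (%hi, %lo) with %hi compensated for a signed %lo."""
    lo = addr & 0xFFFF
    # Adding 0x8000 bumps the high half by one exactly when lo will be
    # sign-extended to a negative number.
    hi = ((addr + 0x8000) >> 16) & 0xFFFF
    return hi, lo

def lui_addiu(hi, lo):
    """Value left in the register after lui reg, hi ; addiu reg, reg, lo."""
    return ((hi << 16) + sign_extend16(lo)) & MASK32

addr = 0x00408000
hi, lo = split_hi_lo(addr)
compensated = lui_addiu(hi, lo)       # rebuilds addr
naive = lui_addiu(addr >> 16, lo)     # uncompensated high half: 0x10000 too low
```

With the low half at 0x8000 or above, the uncompensated high half lands 64K below the intended address, which is why %hi has to know about the low 16 bits.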
>If the linker has enough info to know whether or not carry analysis is
>required, then the pairing requirement isn't needed.
It's always required, at least potentially, the way mips symbol-refs
(eg, "la $2, <external_symbol>") work now, because the low-order offset
is signed. If it were unsigned, then the loads of the upper and lower 16
bits would be disjoint (ordered, but disjoint -- no carries) and this
particular problem would go away. (Is that what you meant to suggest?)
Here's the real problem case, using mips syntax (%lo means signed
16-bit offset, which is sign-extended to 32-bits, so %hi needs carry
compensation):
lui $1, %hi (expr1) # ..carry compensation needed
addiu $1, %lo(expr1) # NB: %lo is sign extended
lui $2, %hi (expr2)
addiu $2, %lo(expr2)
and Haifa scheduling can (from what Warner says) potentially turn it
into something like:
lui $1, %hi (expr1)
lui $2, %hi (expr2)
addiu $1, %lo(expr1)
addiu $2, %lo(expr2)
(which is oversimplified, but it captures what I think Warner Losh is
reporting the Linux/mips people ran into.)
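A toy model of why the interleaving can hurt, sketched in Python. This is my simplified reading, not BFD's actual algorithm: it assumes REL-style relocs where the addend is stored in the instruction fields, and a linker that takes the low half of each HI16 addend from the next LO16 in the reloc stream. The symbol names, addresses and the link/runtime_value helpers are all hypothetical.

```python
MASK32 = 0xFFFFFFFF

def sign_extend16(x):
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x

def link(relocs, symtab):
    """relocs: list of (kind, expr, symbol, in-place 16-bit field)."""
    patched = {}
    for i, (kind, expr, sym, field) in enumerate(relocs):
        base = symtab[sym]
        if kind == "LO16":
            value = (base + sign_extend16(field)) & 0xFFFF
        else:
            # HI16: assumed to borrow its low addend from the next LO16.
            lo_field = next(f for k, _, _, f in relocs[i + 1:] if k == "LO16")
            addend = (field << 16) + sign_extend16(lo_field)
            value = ((base + addend + 0x8000) >> 16) & 0xFFFF
        patched[(expr, kind)] = value
    return patched

def runtime_value(patched, expr):
    """What lui/addiu leave in the register for expr after linking."""
    hi, lo = patched[(expr, "HI16")], patched[(expr, "LO16")]
    return ((hi << 16) + sign_extend16(lo)) & MASK32

symtab = {"A": 0x00401000, "B": 0x00508000}   # made-up addresses
# expr1 = A + 0x9000 (fields: hi=1, lo=0x9000), expr2 = B + 0
paired = [("HI16", "expr1", "A", 1), ("LO16", "expr1", "A", 0x9000),
          ("HI16", "expr2", "B", 0), ("LO16", "expr2", "B", 0)]
interleaved = [paired[0], paired[2], paired[1], paired[3]]

good = runtime_value(link(paired, symtab), "expr2")
bad = runtime_value(link(interleaved, symtab), "expr2")
```

In this model the second lui picks up expr1's low addend, computes the wrong carry, and expr2 ends up 64K off. Whether a real linker fails in exactly this way depends on how it walks the relocs.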
Is that a clearer explanation?
Here are a couple of the relevant comments from elf32-mips.c. First, from
just above _bfd_mips_elf_hi16_reloc(),
/* Do a R_MIPS_HI16 relocation. This has to be done in combination
[... ...] */
and second, from inside the body of _bfd_mips_elf_lo16_reloc(),
/* The low order 16 bits are always treated as a signed
[ ... ..] */
which confirm my understanding.
>This reloc technique is already in use by toolchains from Green Hills,
>GNU toolchains distributed through Wind River, and probably others.
You mean Green Hills and Wind River use an extension with unsigned
low-order offsets? That would make life easier.
But the obvious downside is that the no-carry way means more expensive
code. To compute an array reference, say
ld $t1, $a0+<external-symbol>
using HI16U and LO16U takes at least four instructions
lui tem, %hiu(<external-symbol>)
ori tem, %lou(<external-symbol>)
addu tem, tem, a0
ld $t1, 0(tem)
whereas with signed offsets, you can combine the low-order 16-bit
offset and the add, and do it in three.
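A small Python sketch of the two sequences, to show that both reach the same effective address and where the instruction count differs. The function names are hypothetical, %hiu/%lou are the unsigned relocs discussed above rather than existing ones, and addresses are assumed to be 32 bits.

```python
MASK32 = 0xFFFFFFFF

def sign_extend16(x):
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x

def ea_unsigned(sym, a0):
    """lui / ori / addu / ld 0(tem): returns (effective address, insn count)."""
    tem = (sym >> 16) << 16            # lui  tem, %hiu(sym)
    tem |= sym & 0xFFFF                # ori  tem, tem, %lou(sym)
    tem = (tem + a0) & MASK32          # addu tem, tem, a0
    return tem, 4                      # ld   t1, 0(tem)

def ea_signed(sym, a0):
    """lui / addu / ld %lo(sym)(tem): the load displacement absorbs %lo."""
    hi = ((sym + 0x8000) >> 16) & 0xFFFF
    tem = (hi << 16) & MASK32          # lui  tem, %hi(sym)
    tem = (tem + a0) & MASK32          # addu tem, tem, a0
    ea = (tem + sign_extend16(sym)) & MASK32   # ld t1, %lo(sym)(tem)
    return ea, 3

sym, a0 = 0x1000C000, 0x40             # made-up symbol address and index
unsigned_result = ea_unsigned(sym, a0)
signed_result = ea_signed(sym, a0)
```

The saving comes from the load's displacement field being signed as well, so a signed %lo can ride along in the load itself, while an unsigned low half needs its own ori.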
You could get around both by always using the assembler `la'. But
since you're using Haifa precisely because that's the kind of code you
want to do CSE on and schedule better... there's no win, really.
The approach I prefer is to introduce a new fictitious register to hold
the carry. I'd add a pair of patterns for symbol-refs, one for "lui
reg, %hi" and one for the "addiu reg, %lo(reg2)".
The new lui pattern emits RTL which stores the carry of the low-order
part into the new special register. And the RTL of the corresponding
low-order 16-bit pattern adds in the magic register to the 16-bit
offset.
That'd mean EGCS can never interleave one lui %hi/addiu %lo with
another, because doing so would clobber the magic register. (The fact
that the assembler does it more-or-less the other way round is
immaterial: if they're paired, they're paired, it doesn't matter
which way the dependency points as long as they're all consistent.)
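To illustrate how the fictitious register would pin the pairs together, here is a toy dependency check in Python. It is not how EGCS or the Haifa scheduler is written; it only models each instruction as a set of registers read and written, with "carry" standing in for the proposed magic register, and rejects any reordering that swaps two dependent instructions.

```python
def order_is_legal(program, order):
    """True if `order` (indices into program) preserves every dependency.

    program: list of (text, reads, writes), reads/writes being sets of names.
    """
    position = {idx: pos for pos, idx in enumerate(order)}
    for i in range(len(program)):
        for j in range(i + 1, len(program)):
            _, reads_i, writes_i = program[i]
            _, reads_j, writes_j = program[j]
            dependent = ((writes_i & reads_j)      # read after write
                         or (reads_i & writes_j)   # write after read
                         or (writes_i & writes_j)) # write after write
            if dependent and position[i] > position[j]:
                return False
    return True

# Without the magic register the two pairs are independent.
plain = [
    ("lui $1, %hi(expr1)",   set(),  {"$1"}),
    ("addiu $1, %lo(expr1)", {"$1"}, {"$1"}),
    ("lui $2, %hi(expr2)",   set(),  {"$2"}),
    ("addiu $2, %lo(expr2)", {"$2"}, {"$2"}),
]
# With it, every lui writes "carry" and every addiu reads it.
with_carry = [
    ("lui $1, %hi(expr1)",   set(),           {"$1", "carry"}),
    ("addiu $1, %lo(expr1)", {"$1", "carry"}, {"$1"}),
    ("lui $2, %hi(expr2)",   set(),           {"$2", "carry"}),
    ("addiu $2, %lo(expr2)", {"$2", "carry"}, {"$2"}),
]

interleaved = [0, 2, 1, 3]   # lui, lui, addiu, addiu
plain_ok = order_is_legal(plain, interleaved)
carry_ok = order_is_legal(with_carry, interleaved)
```

In this model the interleaved order is legal for the plain patterns but illegal once the carry register is in play, because the second lui would overwrite "carry" before the first addiu has read it.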
Sound reasonable?