Subject: Re: EGCS enabled on mips
To: Todd Whitesel <toddpw@best.com>
From: Jonathan Stone <jonathan@DSG.Stanford.EDU>
List: tech-toolchain
Date: 10/31/1998 23:11:09
>What if there was an alternative?

>Add a third reloc called "hiadj" or "hi1" which looks at the low 16 bits of
>the expression, and compensates for them. For example (pardon my asm, it's
>been a while):
>
>        lui     $1, %hi(expression)
>        ori     $1, $1, %lo(expression)
>
>        lui     $2, %hiadj(expression)
>        ld      $2, $2, %lo(expression)

But %hiadj is what the current %hi reloc means: the HI16 and LO16
relocs are defined such that the low-order offset is signed.


>If the linker has enough info to know whether or not carry analysis is
>required, then the pairing requirement isn't needed.

It's always required, at least potentially, the way MIPS symbol-refs
(e.g., "la $2, <external_symbol>") work now, because the low-order offset
is signed. If it were unsigned, then the loads of the upper and lower 16
bits would be disjoint (ordered, but disjoint -- no carries) and this
particular problem would go away. (Is that what you meant to suggest?)
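
To make the carry concrete, here is a small Python sketch of how an
address splits into two 16-bit halves when the low half is
sign-extended, as addiu does. The helper names are made up for
illustration and do not come from any toolchain.

```python
def sign_extend16(x):
    """Interpret the low 16 bits of x as a signed value, as addiu does."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x

def hi16(addr):
    """%hi with carry compensation: round up when bit 15 of addr is set."""
    return ((addr + 0x8000) >> 16) & 0xFFFF

def lo16(addr):
    """%lo: just the low 16 bits; the CPU sign-extends them."""
    return addr & 0xFFFF

def lui_addiu(hi, lo):
    """Result of 'lui r, hi' followed by 'addiu r, r, lo' (32-bit wrap)."""
    return ((hi << 16) + sign_extend16(lo)) & 0xFFFFFFFF

addr = 0x1234ABCD            # bit 15 set, so the low half is negative
assert hi16(addr) == 0x1235  # one more than the plain upper half 0x1234
assert lui_addiu(hi16(addr), lo16(addr)) == addr
# Without the compensation the result is off by 0x10000:
assert lui_addiu(addr >> 16, lo16(addr)) == addr - 0x10000
```

With an unsigned low half (lui followed by ori) the plain upper half
would be correct and no compensation would be needed.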

Here's the real problem case, using MIPS syntax (%lo means a signed
16-bit offset, which is sign-extended to 32 bits, so %hi needs carry
compensation):

	    lui    $1, %hi(expr1)		# carry compensation needed
	    addiu  $1, %lo(expr1)		# NB: %lo is sign-extended

	    lui    $2, %hi(expr2)
	    addiu  $2, %lo(expr2)

and Haifa scheduling can (from what Warner says) potentially turn it
into something like:

	    lui    $1, %hi(expr1)
	    lui    $2, %hi(expr2)

	    addiu  $1, %lo(expr1)
	    addiu  $2, %lo(expr2)

(which is oversimplified, but it captures what I think Warner Losh is
reporting the Linux/MIPS people ran into.)
Is that a clearer explanation?
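
A sketch of why the pairing matters once the linker gets involved.
This assumes REL-style relocations, where each instruction's 16-bit
field holds its half of the addend, and a hypothetical linker that
pairs a HI16 with whichever LO16 it sees next. The function name, the
symbol address and the addends are invented for the example.

```python
def sign_extend16(x):
    """Interpret the low 16 bits of x as a signed value."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x

def relocate_hi16(hi_field, lo_field, symbol):
    """New HI16 field, given the addend halves stored in the paired instructions."""
    addend = (hi_field << 16) + sign_extend16(lo_field)
    value = symbol + addend
    return ((value + 0x8000) >> 16) & 0xFFFF

# Hypothetical input: expr2 = sym2 + 0x9000, with sym2 at 0x00408000.
sym2 = 0x00408000
hi2_field, lo2_field = 0x0001, 0x9000   # 0x10000 + (-0x7000) == 0x9000
lo1_field = 0x0010                      # addend half belonging to expr1

right = relocate_hi16(hi2_field, lo2_field, sym2)  # paired with its own %lo
wrong = relocate_hi16(hi2_field, lo1_field, sym2)  # paired with expr1's %lo

assert right == 0x0041   # sym2 + 0x9000 == 0x00411000
assert wrong == 0x0042   # off by one in the upper half
```

In this model the "lui $2" picks up the wrong addend half whenever
another expression's %lo sits between it and its own %lo, which is
exactly the interleaved ordering shown above.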

Here are a couple of the relevant comments from elf32-mips.c.  First, from
just above _bfd_mips_elf_hi16_reloc():

     /* Do a R_MIPS_HI16 relocation.  This has to be done in combination
        [... ...] */

and second, from inside the body of _bfd_mips_elf_lo16_reloc():
	  /* The low order 16 bits are always treated as a signed
	  [ ...  ..] */

which confirm my understanding.

>This reloc technique is already in use by toolchains from Green Hills,
>GNU toolchains distributed through Wind River, and probably others.

You mean Green Hills and Wind River use an extension with unsigned
low-order offsets?  That would make life easier.

But the obvious downside is that the no-carry way means more expensive
code. To compute an array reference, say

    ld	$t1,	 $a0+<external-symbol>

using HI16U and LO16U takes at least four instructions:

      lui   tem, %hiu(<external-symbol>)
      ori   tem, tem, %lou(<external-symbol>)
      addu  tem, tem, a0
      ld    $t1, 0(tem)

whereas with signed offsets, you can combine the low-order 16-bit
offset and the add, and do it in three.
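
As a sanity check on the instruction counts, here is a Python sketch
that models the effective address produced by each sequence. The
three-instruction form is my reading of "combine the offset and the
add" (lui, addu, then a load whose offset field carries %lo); the
function names are made up for illustration.

```python
def sign_extend16(x):
    """Interpret the low 16 bits of x as a signed value."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x

def addr_unsigned(sym, a0):
    """lui/ori/addu, then a load with offset 0: four instructions."""
    tem = (sym >> 16) << 16          # lui  tem, %hiu(sym)
    tem |= sym & 0xFFFF              # ori  tem, tem, %lou(sym)
    tem = (tem + a0) & 0xFFFFFFFF    # addu tem, tem, a0
    return tem                       # ld   $t1, 0(tem)

def addr_signed(sym, a0):
    """lui/addu, then a load whose offset is %lo: three instructions."""
    tem = (((sym + 0x8000) >> 16) & 0xFFFF) << 16    # lui  tem, %hi(sym)
    tem = (tem + a0) & 0xFFFFFFFF                    # addu tem, tem, a0
    return (tem + sign_extend16(sym)) & 0xFFFFFFFF   # ld   $t1, %lo(sym)(tem)

# Both sequences reach the same address, with and without bit 15 set.
for sym in (0x10007FFF, 0x10008000, 0x1234ABCD):
    assert addr_unsigned(sym, 0x40) == addr_signed(sym, 0x40) == sym + 0x40
```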

You could get around both by always using the assembler `la'.  But
since you're using Haifa precisely because that's the kind of code you
want to do CSE on and schedule better... there's no win, really.


The approach I prefer is to introduce a new fictitious register to hold
the carry. I'd add a pair of patterns for symbol-refs, one for "lui
reg, %hi" and one for "addiu reg, %lo(reg2)".

The new lui pattern emits RTL which stores the carry of the low-order
part into the new special register, and the RTL of the corresponding
low-order 16-bit pattern adds the magic register to the 16-bit
offset.

That'd mean EGCS can never interleave one lui %hi/addiu %lo with
another, because doing so would clobber the magic register.  (The fact
that the assembler does it more-or-less the other way round is
immaterial: if they're paired, they're paired; it doesn't matter
which way the dependency points as long as they're all consistent.)
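
A toy model of that dependency, in Python rather than RTL. This is
only a sketch of the idea: the single "carry" slot stands in for the
fictitious register, and the tuple encoding of instructions is
invented for the example. It is not how a GCC machine description
would express it.

```python
def schedule_is_legal(schedule):
    """schedule: list of (opcode, expr) tuples in issue order.

    'hi' writes the hypothetical carry register; 'lo' reads it and must
    see the value written by the 'hi' for the same expression.
    """
    carry_owner = None               # expr whose carry the register holds
    for op, expr in schedule:
        if op == "hi":
            carry_owner = expr       # clobbers any previous carry
        elif op == "lo":
            if carry_owner != expr:
                return False         # would read someone else's carry
    return True

paired = [("hi", "expr1"), ("lo", "expr1"),
          ("hi", "expr2"), ("lo", "expr2")]
interleaved = [("hi", "expr1"), ("hi", "expr2"),
               ("lo", "expr1"), ("lo", "expr2")]

assert schedule_is_legal(paired)
assert not schedule_is_legal(interleaved)
```

A scheduler that respects the write-then-read dependency on the one
register is forced to keep each pair together, which is the point.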

Sound reasonable?