Subject: [rfc] a new read-only plt format
To: None <port-alpha@netbsd.org>
From: Richard Henderson <rth@twiddle.net>
List: port-alpha
Date: 06/01/2005 10:13:45
I've recently created a new plt format for Alpha.  The main feature of
this new format is that the plt code is read-only.  This was a requirement
for SELinux, which does not allow executable pages to become writable,
or writable pages to become executable.  As side effects, the plt
itself is smaller, allows more entries than before, and requires saving
fewer registers.

Support for this new plt format is in binutils head.  I'm writing to the
other free Alpha OSes so that they can experiment with the new format;
if there are tweaks that need to be made, they can be made right away,
before it gets deployed to real users.

The dynamic linker receives the following information:

If the new format is in use, a _DYNAMIC tag DT_ALPHA_PLTRO exists.
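
As a rough illustration of the check, here is a minimal C sketch of how a
dynamic linker might scan _DYNAMIC for the tag.  The struct and the SKETCH_*
names are stand-ins invented for this example; a real linker would use its
own <elf.h> definitions.  The tag value 0x70000000 (DT_LOPROC + 0) is what I
believe the binutils and glibc headers use, but treat it as an assumption and
check your own headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for an ELF64 dynamic entry (hypothetical name, for illustration). */
typedef struct {
    int64_t  d_tag;
    uint64_t d_val;
} dyn_entry;

#define SKETCH_DT_NULL         0           /* terminates the _DYNAMIC array */
#define SKETCH_DT_PLTGOT       3           /* address of .got.plt */
#define SKETCH_DT_ALPHA_PLTRO  0x70000000  /* assumed: DT_LOPROC + 0 */

/* Return 1 if the object carries DT_ALPHA_PLTRO, i.e. uses the
   read-only plt format; return 0 otherwise.  Only presence is checked. */
static int uses_readonly_plt(const dyn_entry *dyn)
{
    assert(dyn != NULL);
    for (; dyn->d_tag != SKETCH_DT_NULL; dyn++) {
        if (dyn->d_tag == SKETCH_DT_ALPHA_PLTRO)
            return 1;
    }
    return 0;
}
```

An object without the tag would be handled by the existing old-format path.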

The DT_PLTGOT tag points to the .got.plt section.  This section is
16 bytes long.  The first word should be filled in with the runtime
resolution entry point; the second word should be filled in with the
dynamic linker cookie.  This is the same information that was placed
in words 3 and 4 of the plt in the old format.
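
The layout just described can be sketched in C as follows.  This is only an
illustration: the type and function names are hypothetical, and it assumes
that "word" here means an 8-byte quadword, which is what a 16-byte section
holding two words implies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* .got.plt in the new format: two 8-byte words, 16 bytes total. */
typedef struct {
    uint64_t resolver;  /* word 0: runtime resolution entry point */
    uint64_t cookie;    /* word 1: dynamic linker cookie */
} got_plt;

/* Fill in the two words at the address given by DT_PLTGOT.
   Both values are supplied by the dynamic linker at load time. */
static void setup_got_plt(got_plt *got, uint64_t resolver_addr,
                          uint64_t cookie)
{
    assert(got != NULL);
    got->resolver = resolver_addr;
    got->cookie   = cookie;
}
```

The dynamic linker would call something like this once per loaded object,
after locating .got.plt through DT_PLTGOT.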

On entry to the runtime resolution entry point, $27 contains the
address of the entry point, $28 contains the dynamic linker cookie,
and $25 contains the offset of the relocation entry in .rela.plt.

When resolving the relocation, the *only* thing that needs to happen
is to write to the word indicated by the relocation.  The plt should
not be modified.  Obviously.
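
To make the "only write the relocated word" rule concrete, here is a hedged
C sketch of a fixup step.  Everything named here is hypothetical: the
object_info struct stands in for whatever the cookie points to, the
sym_values table stands in for real symbol lookup, and I'm assuming the
offset passed in $25 is a byte offset into .rela.plt.  Addend handling is
left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for an ELF64 Rela entry. */
typedef struct {
    uint64_t r_offset;  /* where to write, relative to the load base */
    uint64_t r_info;    /* symbol index in the high 32 bits */
    int64_t  r_addend;  /* ignored in this sketch */
} rela_entry;

/* Hypothetical per-object state reached through the cookie. */
typedef struct {
    unsigned char    *load_base;   /* start of the loaded image */
    const rela_entry *rela_plt;    /* start of .rela.plt */
    const uint64_t   *sym_values;  /* pre-resolved addresses, by symbol index */
} object_info;

/* Resolve one plt relocation.  The only store is to the word named by
   the relocation; the plt code itself is never touched. */
static uint64_t fixup_sketch(const object_info *obj, uint64_t reloc_offset)
{
    assert(reloc_offset % sizeof(rela_entry) == 0);
    const rela_entry *rel = (const rela_entry *)
        ((const unsigned char *)obj->rela_plt + reloc_offset);

    uint64_t target = obj->sym_values[rel->r_info >> 32];
    memcpy(obj->load_base + rel->r_offset, &target, sizeof target);
    return target;  /* the entry point then jumps here */
}
```

The return value plays the role of the address that the entry point routine
moves into $27 before jumping.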

A sample entry point routine is appended.

One complication of this is that we now clobber more registers than
we did before, which means that the division routines cannot go through
the plt anymore.  There are two things that are done to make this work.

First, libc should be modified such that the division routines are *not*
marked STT_FUNC.  The linker only creates plt entries for symbols marked
as functions.  With this change, all existing object files are correctly
handled, *provided* that they are in fact linked against libc eventually.
This is by far the common case.

Marking the division routines as something other than STT_FUNC is done
by *not* using the .ent/.end markers for the division routines.  Instead,
you'll want to write

	.globl	__divq
	.type	__divq,@notype
	.usepv	__divq,no
__divq:
	...
	.size	__divq,.-__divq

In addition, you'll want to use the .cfi_* directives to write unwind
info appropriate to your implementation so that the debugger can 
properly step over your routine.  This is slightly less convenient than
the .ent/.end markers, but in my case .frame wasn't adequate to properly
describe the unwinding and I'd already been using .cfi directives.

Second, a new relocation marker has been added, !lituse_jsrdirect.  This
relocation is similar to !lituse_jsr, except that it prevents the linker
from creating a plt entry for the symbol.  Top-of-branch of gcc 3.4, 
4.0, and 4.1 have been modified to make use of this relocation when 
invoking the division routines.  I dunno how much in the way of hand-coded
assembler you have that might make use of general division; probably none
at all.

Comments?  Questions?



r~



/* void * _dl_fixup (void *cookie, long reloc_offset, void *caller) */

#define FRAMESIZE       14*8

	.align  4
        .globl  _dl_runtime_resolve
        .ent    _dl_runtime_resolve
_dl_runtime_resolve:
        .frame  $30, FRAMESIZE, $26, 0
        .mask   0x4000000, 0

        ldah    $29, 0($27)             !gpdisp!1
        lda     $30, -FRAMESIZE($30)
        stq     $26, 0*8($30)
        stq     $16, 2*8($30)

        stq     $17, 3*8($30)
        lda     $29, 0($29)             !gpdisp!1
        stq     $18, 4*8($30)
        mov     $28, $16                /* link_map from .got.plt */

        stq     $19, 5*8($30)
        mov     $25, $17                /* offset of reloc entry */
        stq     $20, 6*8($30)
        mov     $26, $18                /* return address */

        stq     $21, 7*8($30)
        stt     $f16, 8*8($30)
        stt     $f17, 9*8($30)
        stt     $f18, 10*8($30)

        stt     $f19, 11*8($30)
        stt     $f20, 12*8($30)
        stt     $f21, 13*8($30)
        .prologue 2

        bsr     $26, _dl_fixup          !samegp
        mov     $0, $27

        ldq     $26, 0*8($30)
        ldq     $16, 2*8($30)
        ldq     $17, 3*8($30)
        ldq     $18, 4*8($30)
        ldq     $19, 5*8($30)
        ldq     $20, 6*8($30)
        ldq     $21, 7*8($30)
        ldt     $f16, 8*8($30)
        ldt     $f17, 9*8($30)
        ldt     $f18, 10*8($30)
        ldt     $f19, 11*8($30)
        ldt     $f20, 12*8($30)
        ldt     $f21, 13*8($30)
        lda     $30, FRAMESIZE($30)
        jmp     $31, ($27), 0
        .end    _dl_runtime_resolve