Subject: Re: Loading DSP code from an LKM
To: None <bryanxms@ecst.csuchico.edu>
From: Chris Torek <torek@elf.eng.bsdi.com>
List: tech-kern
Date: 10/02/2001 12:35:41

>The issue is that, on PowerPC, you can't make a call outside of a
>certain address range in the short [fast] way ...

In fact, this is a generic problem on many architectures: branches
of any kind, including subroutine calls, may have a limited reach.
There may be multiple kinds of branches, "short" and "long" (which
is the source of the old joke about the 1802, "the long branch is
not a saloon"; of course the 1802 had some other interesting features,
such as the SEX instruction :-) ).

In some extreme (and obnoxious) cases (such as on the VAX), the
short branches are conditional and the longest branches are
exclusively unconditional, so to branch conditionally to a distant
target you must branch around a branch: a short branch on the
opposite condition skips over a long unconditional jump.  For the
easy cases, the assembler can handle these for
you -- the VAX assemblers have pseudo-ops like "jlss" (blss if
short, bgeq around jmp if long); I think GNU "as" even has a
"jaobleq" and "jsobgtr" (which the old VAX assembler never had).
These are just a way to defer final instruction selection and code
generation until the branch displacements are known.
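
As a rough illustration of that deferred selection, here is a minimal
sketch in C.  It is not taken from any real assembler: the names
(branch_form, fits_signed, select_cond_branch) and the 8-bit short
displacement are assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; no real assembler is being quoted here. */
enum branch_form {
    BRANCH_SHORT,       /* one short conditional branch */
    BRANCH_AROUND_JUMP  /* inverted short branch over a long jump */
};

/* True if disp fits in a signed field of the given width in bits. */
static bool fits_signed(int64_t disp, unsigned bits)
{
    int64_t lo = -((int64_t)1 << (bits - 1));
    int64_t hi = ((int64_t)1 << (bits - 1)) - 1;

    return disp >= lo && disp <= hi;
}

/*
 * Choose the form once the displacement is known.  The 8-bit width
 * of the short form is an assumption made for this sketch.
 */
enum branch_form select_cond_branch(int64_t disp)
{
    return fits_signed(disp, 8) ? BRANCH_SHORT : BRANCH_AROUND_JUMP;
}
```

The point is only that the choice depends on a displacement that is
not known until the code around the branch has been laid out.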

None of this works when the code has been fully assembled and
now needs to be linked instead.  Here the instructions are already
selected: if they turn out to be the wrong ones, you are stuck.

There are a number of possible approaches.  One is to defer
instruction selection and code generation until link time (which
needs much fancier linkers).  Another is to include multiple
alternative code sections and "no-op out" all but the desired ones.
A third is to include "trampolines" (no real relation to the ones
GCC uses for nested C functions): each short call goes directly to
the target if that is in reach, and otherwise to a nearby trampoline
that contains a long call (or jump) to the target.  This last method
is probably the easiest to fit into the existing linkers without
giving up too much performance.
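
To make the trampoline decision concrete, here is a minimal sketch
of what a linker might do for each short call once addresses are
known.  Everything in it is hypothetical (struct call_fixup,
in_reach, resolve_call), and the 26-bit signed byte displacement is
an assumption chosen to resemble a PowerPC-style relative call; no
existing linker is being described.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed width of the short call's signed byte displacement. */
#define SHORT_CALL_BITS 26

/* Hypothetical per-call-site record; not from any real linker. */
struct call_fixup {
    uint64_t site;        /* address of the short call instruction */
    uint64_t target;      /* address of the function being called */
    uint64_t trampoline;  /* address of a stub holding a long jump */
};

/* True if a short call located at 'from' can reach 'to'. */
static bool in_reach(uint64_t from, uint64_t to)
{
    int64_t disp = (int64_t)(to - from);
    int64_t limit = (int64_t)1 << (SHORT_CALL_BITS - 1);

    return disp >= -limit && disp < limit;
}

/*
 * Return the address the short call should be pointed at: the
 * target itself when it is in reach, otherwise the trampoline.
 * Returns 0 if neither is reachable (the stub was placed too far).
 */
uint64_t resolve_call(const struct call_fixup *f)
{
    if (in_reach(f->site, f->target))
        return f->target;
    if (in_reach(f->site, f->trampoline))
        return f->trampoline;
    return 0;
}
```

Calls that are already in reach go straight to the target and pay
nothing extra; only the far ones take the detour through the stub.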

Chris