Port-vax archive


Re: VAX addressing modes



>>>    5ca03:       cf 50 00 0a     casel r0,$0x0,$0xa
>>>    5ca07:       27 11 38 45     divp $0x11,$0x38,$0x14[r5],*0xffffd214(r2),*0x46(r0),*0x46(r0)
>> The disassembler is defective; it is failing to understand that
>> casel is followed by a jump table, not ordinary code.
> FYI, `objdump' uses symbol types rather than heuristics to determine
> whether to dump binary contents as code or data.  This is especially so
> because you can request an arbitrary range to be dumped and looking
> back for any preceding instruction that could change interpretation
> is infeasible.

Same as for pretty much any architecture with variable-length
instructions.

> Therefore the correct solution is to insert a symbol of the data type
> (STT_OBJECT in ELF-speak) at the beginning of any text content that
> is supposed to be interpreted as data and then a symbol of the code
> type (STT_FUNC in ELF-speak) at the end (if applicable).  This has to
> be done by the compiler (or whoever has written handcoded assembly
> code).  Some GCC target backends do this already, the VAX one clearly
> does not, as shown by this example.

I think I disagree.

The jump table for a case instruction actually is code; I wrote too
briefly.  It is just unusual code.  But it is still code: it is fetched
from the instruction stream, is cached in the I-cache (if the CPU has
one), and is subject to all the restrictions on code (such as not
coming within 512 bytes of nonexistent memory).  The VAX is more von
Neumann than Harvard, but from a Harvard point of view the displacement
table comes from instruction memory.

Starting disassembly in the middle of a CASE jump table is no different
from starting disassembly in the middle of any other multi-byte
instruction.  I would not expect a disassembler to look backwards in
either case.
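
To make the point concrete, here is a minimal sketch of a linear-sweep
decoder run from two different start offsets.  The encoding is a toy
one invented for illustration (the opcode values, names, and operand
counts are assumptions, not the VAX instruction set); the only property
it shares with the VAX is that instruction length depends on the opcode.

```python
# Toy variable-length encoding (hypothetical, NOT the VAX instruction set):
# each opcode byte fixes how many operand bytes follow it.
TOY_OPCODES = {0x01: ("nop", 0), 0x02: ("push", 1), 0x03: ("jmp", 2)}


def linear_sweep(code, start):
    """Decode forward from `start`, never looking backwards."""
    decoded = []
    pc = start
    while pc < len(code):
        name, nargs = TOY_OPCODES.get(code[pc], (".byte", 0))
        if pc + 1 + nargs > len(code):
            # Operands would run off the end; emit the byte as raw data.
            name, nargs = ".byte", 0
        operands = bytes(code[pc + 1 : pc + 1 + nargs])
        decoded.append((pc, name, operands))
        pc += 1 + nargs
    return decoded


code = bytes([0x03, 0x02, 0x01, 0x02, 0x03, 0x01])

# Starting at the true instruction boundary versus one byte in.
aligned = [name for _, name, _ in linear_sweep(code, 0)]
misaligned = [name for _, name, _ in linear_sweep(code, 1)]
print(aligned)
print(misaligned)
```

Both sweeps are internally consistent; nothing in the bytes themselves
says which start offset was the intended one.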

Disassembling the above starting at 5ca05, I get

         5ca05: halt
         5ca06: index   $27,$11,$38,$14[r5],*-2dec(r2),*46(r0)
         5ca11: movw    *(r5)+[r6],$12
         5ca15: tstb    $12
         5ca17: blss    0x5ca19
         5ca19: bgtr    0x5c9ed
         5ca1b: editpc  $31[r5],$6,$0,$17
         5ca21: pushab  0x5d09b
         5ca25: halt    
         5ca26: pushl   r10
         5ca28: calls   $1,*0x1a60ec

which is obvious nonsense in at least three superficial ways and a
whole lot of slightly deeper ways.  (The superficial ones I see:
indexed short literal mode at 5ca06 and 5ca1b and a MOVW to a short
literal at 5ca11.)  Would you expect a disassembler to look backwards
and figure out that 5ca05 is two bytes into a CASEL?  I wouldn't, no
more than I would expect disassembly at 5ca21 to recognize it as one
byte into a JMP rather than being a PUSHAB.  (Indeed, there probably is
someone twisted enough to have crafted code that is one sensible code
sequence when entered at one point and a different sensible code
sequence when entered at what from the first point of view is partway
through an instruction.)

In my opinion, the problem of disassembling CASE displacement tables is
essentially the same as the problem of disassembling any multi-byte
instruction: if you start at the correct place (the beginning of the
instruction), there is no excuse for getting it wrong; if you start at
an incorrect place (anywhere else), there is no reason to expect to get
it right.
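
As a rough illustration of "starting at the correct place", the sketch
below takes the four CASEL bytes from the quoted dump and works out
where the displacement table begins and ends, using the fact that the
table holds limit+1 word (16-bit) displacements.  It is only a sketch:
the function names are made up for this example, and it assumes the
operand shapes seen in the dump (register selector, short-literal base
and limit), not the general operand-specifier decoding a real
disassembler needs.

```python
CASEL_OPCODE = 0xCF  # opcode byte as shown in the quoted dump


def short_literal(spec):
    """Value of a short-literal operand specifier (top two bits 00)."""
    if spec & 0xC0:
        raise ValueError("not a short literal: %#04x" % spec)
    return spec & 0x3F


def casel_table_extent(code, addr):
    """Return (table_start, table_end) for a CASEL located at `addr`.

    Assumption: selector is in register mode (0x5n) and base and limit
    are short literals, so the instruction proper is exactly 4 bytes.
    """
    if code[0] != CASEL_OPCODE:
        raise ValueError("not a CASEL")
    if code[1] & 0xF0 != 0x50:
        raise ValueError("selector is not in register mode")
    short_literal(code[2])          # base: validated, not needed for the extent
    limit = short_literal(code[3])
    table_start = addr + 4
    # limit+1 displacements, two bytes each
    table_end = table_start + 2 * (limit + 1)
    return table_start, table_end


start, end = casel_table_extent(bytes([0xCF, 0x50, 0x00, 0x0A]), 0x5CA03)
print(hex(start), hex(end))
```

For the bytes quoted above this puts the table at 5ca07 up to (but not
including) 5ca1d, so a decoder that began at 5ca03 could resume
ordinary decoding at 5ca1d, which lies inside what the misaligned
listing shows as an EDITPC at 5ca1b.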

/~\ The ASCII				  Mouse
\ / Ribbon Campaign
 X  Against HTML		mouse%rodents-montreal.org@localhost
/ \ Email!	     7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B

