Subject: That SPARC disassembler
To: None <>
From: der Mouse <mouse@Collatz.McRCIM.McGill.EDU>
List: port-sparc
Date: 12/08/1994 18:11:40
I started working with that SPARC disassembler, initially trying to
merge it into a multi-hardware disassembler framework I already have.
In the process, I found some peculiar things.

Four instructions (udiv, sdiv, udivcc, sdivcc) occur with an operand
signature of "12i".  But 2 prints the register whose number is in the
low 5 bits of the opcode, and i prints the immediate value in the low
13 bits of the opcode.  How can the same instruction use both 2 and i?
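
To make the conflict concrete, here is a minimal sketch in C. It assumes only the field positions described above; the names RS2_MASK, SIMM13_MASK and the helper functions are mine for illustration, not taken from the disassembler.

```c
#include <stdint.h>

/* Hypothetical masks, following the description of '2' and 'i' above. */
#define RS2_MASK    0x0000001fu  /* low 5 bits: register printed by '2' */
#define SIMM13_MASK 0x00001fffu  /* low 13 bits: immediate printed by 'i' */

/* Nonzero if two field masks share any bit of the instruction word. */
static int fields_overlap(uint32_t a, uint32_t b)
{
    return (a & b) != 0;
}

/* Raw bits each signature letter would read from the same word. */
static unsigned rs2_bits(uint32_t insn)   { return insn & RS2_MASK; }
static unsigned imm13_bits(uint32_t insn) { return insn & SIMM13_MASK; }
```

Since the register field lies entirely inside the immediate field, any word decoded with "12i" would print the same five bits twice, once as a register and once as part of the immediate.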

You print "nop" only for the equivalent of "sethi 0, %g0"; Sun adb also
prints nop for "or %g0, %g0, %g0" and probably others.
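
As a rough sketch of what recognizing both forms might look like: the two constants below are the standard SPARC encodings of the two instructions named above, while the function name is hypothetical and the list is certainly not complete.

```c
#include <stdint.h>

#define NOP_SETHI 0x01000000u  /* sethi 0, %g0 */
#define NOP_OR    0x80100000u  /* or %g0, %g0, %g0 */

/* Returns nonzero for the two encodings mentioned above.  Other
   instructions with no effect are deliberately not covered here. */
static int looks_like_nop(uint32_t insn)
{
    return insn == NOP_SETHI || insn == NOP_OR;
}
```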

You have multiple entries that clash over the opcode.  For example,

  { (FORMAT2(0,2)|COND(1)|A(1)), "be,a", "m" },
  { (FORMAT2(0,2)|COND(1)|A(0)), "be", "m" },

  { (FORMAT2(0,2)|COND(1)|A(1)|P(1)), "be,a,pt", "m" },
  { (FORMAT2(0,2)|COND(1)|A(1)|P(0)), "be,a,pn", "m" },
  { (FORMAT2(0,2)|COND(1)|A(0)|P(1)), "be,pt", "m" },
  { (FORMAT2(0,2)|COND(1)|A(0)|P(0)), "be,pn", "m" },

There are two problems here.  First, the opcodes with P(0) are
identical to the non-predicting opcodes.  Second, the P() bit is one of
the bits used by the m signature!

Under some circumstances you ignore some bits in the opcode.  For
example, 0x88008001 and 0x88008f01 will both disassemble as
"add %g2, %g1, %g4".  Does the hardware actually ignore these bits?
Will it continue to do so forever?
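
To show which bits are being dropped, here is a small sketch that pulls the fields out of a format-3 register-register word. The field positions are the ones in the SPARC architecture manual; the helper names are hypothetical.

```c
#include <stdint.h>

/* Format-3 fields, register-register form (i == 0). */
static unsigned f3_rd(uint32_t w)  { return (w >> 25) & 0x1f; }
static unsigned f3_op3(uint32_t w) { return (w >> 19) & 0x3f; }
static unsigned f3_rs1(uint32_t w) { return (w >> 14) & 0x1f; }
static unsigned f3_i(uint32_t w)   { return (w >> 13) & 0x1; }
static unsigned f3_rs2(uint32_t w) { return w & 0x1f; }

/* Bits 12..5: covered by none of the fields above when i == 0. */
static unsigned f3_leftover(uint32_t w) { return (w >> 5) & 0xff; }
```

Both example words give rd = 4, op3 = 0, rs1 = 2, i = 0, rs2 = 1; they differ only in f3_leftover(), which is 0 for 0x88008001 and 0x78 for 0x88008f01.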

You have duplicate opcodes.  How does one tell a tadd* from a tsub*?
  { (FORMAT3(2,0x21,1)), "taddcc", "1id" },
  { (FORMAT3(2,0x21,0)), "taddcc", "12d" },
  { (FORMAT3(2,0x23,1)), "taddcctv", "1id" },
  { (FORMAT3(2,0x23,0)), "taddcctv", "12d" },
  { (FORMAT3(2,0x21,1)), "tsubcc", "1id" },
  { (FORMAT3(2,0x21,0)), "tsubcc", "12d" },
  { (FORMAT3(2,0x23,1)), "tsubcctv", "1id" },
  { (FORMAT3(2,0x23,0)), "tsubcctv", "12d" },
Based on Sun's adb, I am assuming the tadd* instructions should have
the low bit of the second FORMAT3() argument cleared (that is, 0x20 and
0x22 rather than 0x21 and 0x23).
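
A clash like this is easy to detect mechanically. The sketch below assumes a FORMAT3() that puts its arguments in bits 31..30, 24..19 and 13 (the usual format-3 layout; the real macro may differ), and the struct and function names are made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed definition; the disassembler's own macro may differ. */
#define FORMAT3(op, op3, i) \
    (((uint32_t)(op) << 30) | ((uint32_t)(op3) << 19) | ((uint32_t)(i) << 13))

struct entry {
    uint32_t opcode;
    const char *name;
    const char *sig;
};

/* The eight entries quoted above, unchanged. */
static const struct entry tagged_ops[] = {
    { (FORMAT3(2,0x21,1)), "taddcc", "1id" },
    { (FORMAT3(2,0x21,0)), "taddcc", "12d" },
    { (FORMAT3(2,0x23,1)), "taddcctv", "1id" },
    { (FORMAT3(2,0x23,0)), "taddcctv", "12d" },
    { (FORMAT3(2,0x21,1)), "tsubcc", "1id" },
    { (FORMAT3(2,0x21,0)), "tsubcc", "12d" },
    { (FORMAT3(2,0x23,1)), "tsubcctv", "1id" },
    { (FORMAT3(2,0x23,0)), "tsubcctv", "12d" },
};

/* Count pairs of table entries whose opcode values are identical. */
static size_t count_clashes(const struct entry *t, size_t n)
{
    size_t i, j, clashes = 0;

    for (i = 0; i < n; i++)
        for (j = i + 1; j < n; j++)
            if (t[i].opcode == t[j].opcode)
                clashes++;
    return clashes;
}
```

Run over the eight entries, this finds four clashing pairs, one for each tadd*/tsub* pairing.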

How does one tell a stbar from a membar?
  { (FORMAT3(2,0x28,1)|((0xf) << 14)), "membar", "9" },
  { (FORMAT3(2,0x28,0)|RS1(0xf)), "stbar", "" },
As far as I can tell those are identical, because the RS1(0xf) (which
is identical to 0xf<<14) will shift the 0x8 bit of its argument on top
of the bit position where FORMAT3 puts its third argument.

Again, apparently identical instructions:
  { (FORMAT3(2,0x2c,1)|COND(9)),  "movne", "0jd" },
  { (FORMAT3(2,0x2c,1)|COND(9)), "move", "ojd" },
What's more, the COND() bits are the same as four of the bits the d
signature character uses, making these look even more dubious.  The
opcode names are the same for some COND() values (eg, 8 -> mova) and
different for others (eg, 9 -> movne/move).

A set of six instructions like this, with different RCOND34()
arguments, suffers from a similar problem:
  { (FORMAT3(2,0x2f,1)|RCOND34(1)), "movrz", "1jd" },
The problem here is that the j signature letter uses the 0x7ff bits,
but RCOND34 shifts its argument left by 10, so the low bit of the
RCOND34 argument overlaps the high bit used by j.
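
The overlap can be checked directly. This sketch takes the shift count and the mask from the description above; the macro body, J_MASK and the function name are assumptions for illustration, not the disassembler's own definitions.

```c
#include <stdint.h>

/* Assumed from the description above; the real macro may differ. */
#define RCOND34(x) ((uint32_t)(x) << 10)
#define J_MASK     0x7ffu  /* bits read by the 'j' signature letter */

/* Bits of RCOND34(rcond) that fall inside the 'j' field. */
static uint32_t rcond_j_overlap(unsigned rcond)
{
    return RCOND34(rcond) & J_MASK;
}
```

Any odd RCOND34() argument lands a bit at 0x400, which is the top bit of the 0x7ff field.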

  { (FORMAT3(2,0x30,1)), "wr", "1iH" },
  { (FORMAT3(2,0x30,1)|RD(0xf)), "sir", "i" },
This looks to me as though the opcodes that match the second line are
a subset of those that match the first.
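
A sketch of the subset relation, under two assumptions that I have not verified against the source: that FORMAT3() and RD() place their arguments at the usual format-3 positions, and that an entry matches whenever all of its set bits are present in the instruction word. The names WR_BITS, SIR_BITS and bits_subset are hypothetical.

```c
#include <stdint.h>

/* Assumed definitions; the disassembler's own macros may differ. */
#define FORMAT3(op, op3, i) \
    (((uint32_t)(op) << 30) | ((uint32_t)(op3) << 19) | ((uint32_t)(i) << 13))
#define RD(x) ((uint32_t)(x) << 25)

static const uint32_t WR_BITS  = FORMAT3(2,0x30,1);
static const uint32_t SIR_BITS = FORMAT3(2,0x30,1) | RD(0xf);

/* True if every bit set in a is also set in b. */
static int bits_subset(uint32_t a, uint32_t b)
{
    return (a & b) == a;
}
```

Under that matching rule, bits_subset(WR_BITS, SIR_BITS) holds, so every word that matches the sir entry also matches the wr entry, and which name gets printed depends on table order.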

It also appears to me that there's no way for the disassembler to
generate the %psr, %wim, or %tbr register names under any
circumstances; the strings simply aren't there.  I thought this was
supposed to support current SPARCs...what am I missing?

					der Mouse