Subject: RE: Instruction question: bbXX .vs. insv
To: 'Matt Thomas' <matt@3am-software.com>
From: Antonio Carlini <arcarlini@iee.org>
List: port-vax
Date: 07/18/2003 17:58:53
> In a number of places, NetBSD/vax uses the construct of
>
> bb{ss,cs,sc,cc} <BITNO>,<DST>,label
> label:
For CVAX, the timings are like this:
Work out the timings for each operand (bearing in mind that the
last operand is different) and add on the timing for the instruction
itself.
<BITNO> and <DST> will take the same number of "operand cycles" for
either form; best case is 1 microcycle each (for short literal or
register access).
BBxx has <label> as the final operand and that costs 1 microcycle plus
1 read cycle, which is listed as 1+1r in the table. If the opcode and
the first specifier of the instruction at the branch target overlap,
throw in one extra microcycle. So here you are likely to have a
cost so far of 2+1r.
For a register DST the following additional costs apply:
(no branch): 6
(branch): 8 + 1r
For a memory DST it becomes
(no branch): 7 + 1r + 1w
(branch): 7 + 2r + 1w
So the total cost for BBxx with register DST, a non-overlapping label
and no branch: 9 + 1r
For a BBxx with memory DST and an overlapping label and a taken
branch: 11 + 3r + 1w.
(Plus the cost of the two operands that were omitted above as common).
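The BBxx sums above can be sketched as a small tally. This is only a sketch using the figures quoted in this message; the names (Cost, bbxx_cost and so on) are made up for illustration. One assumption to flag: the two totals quoted above (9 + 1r and 11 + 3r + 1w) only come out if the best-case 1 microcycle for each of <BITNO> and <DST> is counted in. Without them the same sums give 7 + 1r and 9 + 3r + 1w, so the sketch makes that inclusion a switch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cost:
    """A CVAX timing figure: microcycles plus read ("r") and write ("w") cycles."""
    micro: int = 0
    reads: int = 0
    writes: int = 0

    def __add__(self, other):
        return Cost(self.micro + other.micro,
                    self.reads + other.reads,
                    self.writes + other.writes)

    def __str__(self):
        parts = [str(self.micro)]
        if self.reads:
            parts.append(f"{self.reads}r")
        if self.writes:
            parts.append(f"{self.writes}w")
        return " + ".join(parts)


# Figures as quoted in the message (CVAX).
LABEL = Cost(1, 1)      # branch displacement operand: 1 + 1r
OVERLAP = Cost(1)       # extra microcycle if the target's opcode and first specifier overlap
BBXX = {
    ("reg", False): Cost(6),
    ("reg", True): Cost(8, 1),
    ("mem", False): Cost(7, 1, 1),
    ("mem", True): Cost(7, 2, 1),
}
# ASSUMPTION: best case of 1 microcycle each for <BITNO> and <DST>.
BEST_CASE_OPERANDS = Cost(2)


def bbxx_cost(dst, taken, overlap, include_operands=True):
    """Tally a BBxx instruction; dst is "reg" or "mem"."""
    total = LABEL + BBXX[(dst, taken)]
    if overlap:
        total = total + OVERLAP
    if include_operands:
        total = total + BEST_CASE_OPERANDS
    return total


print(bbxx_cost("reg", taken=False, overlap=False))  # 9 + 1r
print(bbxx_cost("mem", taken=True, overlap=True))    # 11 + 3r + 1w
```

Passing include_operands=False shows the same cases with the two common operands left out.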
> to set or clear individual bits. Now that got me to
> wondering whether doing the following is an improvement:
>
> insv {$1,$0},<BITNO>,$1,<DST>
INSV has two extra arguments: assuming a short literal for the
first, that's 1 microcycle. Same goes for the second. But DST
might actually count as 0 here as last operand if it is a
register. So, for simplicity, count the overall cost of
the two args as 1+1-1 = 1.
For a register operand, INSV is listed as 10-12, with 10 typical.
For memory, INSV is listed as 13 + 1r + 1w => 15 + 2r + 3w
(with the shorter time being typical).
So for CVAX, assuming I've counted right (and understood the
timings in the first place :-)) then BBxx looks faster for
registers/branch-not-taken. Beyond that, the savings
seem to drop away.
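A rough side-by-side of the two forms, using only the figures quoted in this message, might look like the sketch below. Each cost is a (microcycles, reads, writes) tuple. The assumptions are mine: the best-case 2 microcycles for <BITNO> and <DST> are added to both forms, the label is taken as non-overlapping, INSV uses its "typical" figure, and all the names are hypothetical. The tables give no weight for a read or write cycle, so the sketch tallies the three columns and does not pick a winner.

```python
def add(*costs):
    """Sum (microcycles, reads, writes) tuples column by column."""
    return tuple(sum(col) for col in zip(*costs))


# ASSUMPTION: best case, 1 microcycle each for <BITNO> and <DST>, both forms.
COMMON = (2, 0, 0)

# INSV figures quoted in the message.
INSV_EXTRA = (1, 0, 0)  # the two extra operands, counted as 1+1-1
INSV_TYPICAL = {"reg": (10, 0, 0), "mem": (13, 1, 1)}

# BBxx figures quoted in the message.
LABEL = (1, 1, 0)       # non-overlapping label: 1 + 1r
BBXX = {
    ("reg", False): (6, 0, 0),
    ("reg", True): (8, 1, 0),
    ("mem", False): (7, 1, 1),
    ("mem", True): (7, 2, 1),
}


def insv_total(dst):
    return add(COMMON, INSV_EXTRA, INSV_TYPICAL[dst])


def bbxx_total(dst, taken):
    return add(COMMON, LABEL, BBXX[(dst, taken)])


for dst in ("reg", "mem"):
    for taken in (False, True):
        print(dst, "branch" if taken else "no branch",
              "BBxx", bbxx_total(dst, taken), "INSV", insv_total(dst))
```

How the extra read cycles on the BBxx side weigh against the extra microcycles on the INSV side depends on the memory system, which this tally leaves out.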
All of this is moot anyway, since the cache will make a huge difference
in all of this (except the cycle times I guess).
I have a 78032 (uVAX II) manual somewhere that probably lists timings
and they're probably simpler than the CVAX ones. I also have an NVAX
(or NVAX+) spec somewhere that *may* list such things but the timings
are probably even less useful for that case. Assuming I find these
manuals, I'll see what I can cook up.
I assume this is some frequently hit code that you want to change?
Antonio
-- 
---------------
Antonio Carlini arcarlini@iee.org