Port-arm archive


Re: ARM Cortex-A72 slow multiply (MADD) instruction execution



On Wed, 15 Apr 2020 13:20:18 +0100
Sad Clouds <cryintothebluesky%gmail.com@localhost> wrote:

> I was a bit surprised to find that on this hardware int64 multiply
> instruction (MADD) is rather slow, compared to int32, float and
> double.

Somebody on the ARM forums pointed me to the instruction characteristics
document for this CPU:
https://static.docs.arm.com/uan0016/a/cortex_a72_software_optimization_guide_external.pdf

Looks like int64 multiplication (MADD with Xn registers) has one third
the throughput of int32, float and double multiplication. It seems this
is how the hardware was designed and implemented. It's quite
interesting, as I never realised this was the case for aarch64.
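
For anyone who wants to check this on their own board, here is a rough
sketch of a throughput test. It is not the benchmark from the original
post; the names madd32, madd64, now_sec and run_benchmark are made up
for illustration. It assumes a POSIX system with clock_gettime() and
that the compiler turns each "x * k + c" into a scalar MADD, which is
worth confirming in the disassembly, since a vectorised 32-bit loop
would skew the comparison.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Four independent multiply-add chains, so the loop should be limited
 * by multiplier throughput rather than by the latency of one chain.
 * Unsigned types are used so that wrap-around is well defined. */
uint32_t madd32(uint32_t k, uint32_t c, uint64_t iters)
{
    uint32_t a = 1, b = 2, d = 3, e = 4;
    for (uint64_t i = 0; i < iters; i++) {
        a = a * k + c;
        b = b * k + c;
        d = d * k + c;
        e = e * k + c;
    }
    return a ^ b ^ d ^ e;   /* combine so no chain is dead code */
}

/* Same loop with 64-bit operands (Xn registers on aarch64). */
uint64_t madd64(uint64_t k, uint64_t c, uint64_t iters)
{
    uint64_t a = 1, b = 2, d = 3, e = 4;
    for (uint64_t i = 0; i < iters; i++) {
        a = a * k + c;
        b = b * k + c;
        d = d * k + c;
        e = e * k + c;
    }
    return a ^ b ^ d ^ e;
}

/* Monotonic wall-clock time in seconds. */
double now_sec(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Time both loops and print the average cost per multiply-add. */
void run_benchmark(uint64_t iters)
{
    /* volatile hides the constants from the optimiser */
    volatile uint32_t k = 3, c = 1;

    double t0 = now_sec();
    uint32_t r32 = madd32(k, c, iters);
    double t1 = now_sec();
    uint64_t r64 = madd64(k, c, iters);
    double t2 = now_sec();

    double ops = 4.0 * (double)iters;
    printf("int32: %.3f ns per madd (result %lu)\n",
           (t1 - t0) * 1e9 / ops, (unsigned long)r32);
    printf("int64: %.3f ns per madd (result %llu)\n",
           (t2 - t1) * 1e9 / ops, (unsigned long long)r64);
}
```

Calling run_benchmark() from main() with a large iteration count (say
100 million) and building with optimisation enabled should be enough to
compare the two numbers. If the throughput figures in the guide apply,
the int64 loop would be expected to take noticeably longer per
operation, but the exact ratio will depend on the compiler output and
on what else limits the loop.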

