[src/trunk]: src/crypto/external/bsd/openssl/dist/crypto/bn/asm openssl: Remo...



details:   https://anonhg.NetBSD.org/src/rev/b4ec2c55b5f5
branches:  trunk
changeset: 374060:b4ec2c55b5f5
user:      riastradh <riastradh%NetBSD.org@localhost>
date:      Wed Mar 29 13:07:46 2023 +0000

description:
openssl: Remove local micro-optimization on AMD (but not Intel).

Upstream OpenSSL changed

        loop 1b

to

        dec %rcx
        jnz 1b

which has mostly the same semantics, in this change:

https://github.com/openssl/openssl/pull/4743
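
As a rough illustration of why the two forms are interchangeable in
this loop: the carry chain through adcq depends on CF surviving from
one iteration to the next.  LOOP modifies no flags at all, and DEC
updates ZF (and the other arithmetic flags) but leaves CF untouched,
so either one can close the loop without disturbing the carry.  The
sketch below is a portable C model of that loop shape, not the OpenSSL
code: the name add_words_model is hypothetical, `carry' stands in for
CF and `n' for %rcx, and the early return for n == 0 and the cleared
carry on entry are assumptions about the surrounding function.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Portable model of the carry-chained add loop (hypothetical helper,
 * not the OpenSSL implementation).  `carry' plays the role of CF and
 * `n' the role of %rcx.
 */
static uint64_t
add_words_model(uint64_t *rp, const uint64_t *ap, const uint64_t *bp,
    size_t n)
{
	uint64_t carry = 0;	/* CF, assumed clear on entry */
	size_t i = 0;

	if (n == 0)		/* assumed guard; the loop body runs at least once */
		return 0;
	do {
		/* adcq: a + b + CF, computed in two steps to catch the carry */
		uint64_t t = ap[i] + carry;
		uint64_t c1 = t < carry;
		uint64_t s = t + bp[i];
		uint64_t c2 = s < t;

		rp[i] = s;		/* movq: store the result word */
		carry = c1 | c2;	/* at most one of c1, c2 is set */
		i++;			/* lea: touches no flags */
	} while (--n != 0);		/* loop, or dec+jnz: CF is not written */

	return carry;		/* final carry out of the top word */
}
```

In the model, the loop condition never writes `carry', which is the
property both LOOP and DEC;JNZ share on real hardware.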

For some reason, in one of the OpenSSL updates, we ended up with a
local change to revert this.

The Intel and AMD optimization guides are silent on the LOOP
instruction, but Agner Fog's tables show that while LOOP is one
cycle shorter than DEC;JNZ on AMD Zen microarchitectures, it is a
good half dozen cycles longer than DEC;JNZ on recent Intel
microarchitectures.

The history of the OpenSSL change suggests it was intended, and I
can't find any indication other than `merge conflicts' that we
intended to keep the LOOP version.  So let's reduce the local diff by
nixing it.

diffstat:

 crypto/external/bsd/openssl/dist/crypto/bn/asm/x86_64-gcc.c |  14 +++++++-----
 1 files changed, 8 insertions(+), 6 deletions(-)

diffs (31 lines):

diff -r 33b0fc247f8f -r b4ec2c55b5f5 crypto/external/bsd/openssl/dist/crypto/bn/asm/x86_64-gcc.c
--- a/crypto/external/bsd/openssl/dist/crypto/bn/asm/x86_64-gcc.c       Wed Mar 29 13:01:44 2023 +0000
+++ b/crypto/external/bsd/openssl/dist/crypto/bn/asm/x86_64-gcc.c       Wed Mar 29 13:07:46 2023 +0000
@@ -219,9 +219,10 @@ BN_ULONG bn_add_words(BN_ULONG *rp, cons
                   "       adcq    (%5,%2,8),%0    \n"
                   "       movq    %0,(%3,%2,8)    \n"
                   "       lea     1(%2),%2        \n"
-                  "       loop    1b              \n"
-                  "       sbbq    %0,%0           \n":"=&r" (ret), "+c"(n),
-                  "+r"(i)
+                  "       dec     %1              \n"
+                  "       jnz     1b              \n"
+                  "       sbbq    %0,%0           \n"
+                  :"=&r" (ret), "+c"(n), "+r"(i)
                   :"r"(rp), "r"(ap), "r"(bp)
                   :"cc", "memory");
 
@@ -245,9 +246,10 @@ BN_ULONG bn_sub_words(BN_ULONG *rp, cons
                   "       sbbq    (%5,%2,8),%0    \n"
                   "       movq    %0,(%3,%2,8)    \n"
                   "       lea     1(%2),%2        \n"
-                  "       loop    1b              \n"
-                  "       sbbq    %0,%0           \n":"=&r" (ret), "+c"(n),
-                  "+r"(i)
+                  "       dec     %1              \n"
+                  "       jnz     1b              \n"
+                  "       sbbq    %0,%0           \n"
+                  :"=&r" (ret), "+c"(n), "+r"(i)
                   :"r"(rp), "r"(ap), "r"(bp)
                   :"cc", "memory");
 


