NetBSD-Bugs archive


Re: lib/59329: NetBSD's OpenSSL 7x slower at AES than same upstream version on NetBSD



The following reply was made to PR lib/59329; it has been noted by GNATS.

From: RVP <rvp%SDF.ORG@localhost>
To: gnats-bugs%netbsd.org@localhost
Cc: 
Subject: Re: lib/59329: NetBSD's OpenSSL 7x slower at AES than same upstream
 version on NetBSD
Date: Mon, 21 Apr 2025 07:02:15 +0000 (UTC)

 On Sat, 19 Apr 2025, nia%pkgsrc.org@localhost wrote:
 
 > When using OpenSSL's own build system, the following arch-specific
 > CFLAGS are applied to every file:
 >
 > -DAES_ASM -DBSAES_ASM -DCMLL_ASM -DECP_NISTZ256_ASM -DGHASH_ASM -DKECCAK1600_ASM -DMD5_ASM -DOPENSSL_BN_ASM_GF2m -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_CPUID_OBJ -DOPENSSL_IA32_SSE2 -DPOLY1305_ASM -DRC4_ASM -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DVPAES_ASM -DWHIRLPOOL_ASM -DX25519_ASM -fPIC -pthread -Wa,--noexecstack -Wall -O3 -DL_ENDIAN -DOPENSSL_PIC
 >
 > On NetBSD it seems slightly harder to tell, since such flags
 > are applied on a per-file basis (error-prone when OpenSSL is
 > updated?).
 >
 
 Yeah, that's pretty much what this is. I managed to get a roughly 10-11x speedup on AES-128-CBC:
 
 ```
 $ openssl speed -evp aes-128-cbc
 Doing AES-128-CBC for 3s on 16 size blocks: 22702707 AES-128-CBC's in 3.00s
 Doing AES-128-CBC for 3s on 64 size blocks: 6494628 AES-128-CBC's in 3.00s
 Doing AES-128-CBC for 3s on 256 size blocks: 1684133 AES-128-CBC's in 3.00s
 Doing AES-128-CBC for 3s on 1024 size blocks: 429110 AES-128-CBC's in 3.00s
 Doing AES-128-CBC for 3s on 8192 size blocks: 53846 AES-128-CBC's in 3.00s
 Doing AES-128-CBC for 3s on 16384 size blocks: 26979 AES-128-CBC's in 3.00s
 version: 3.0.15
 NetBSD 10.99.14
 options: bn(64,64)
 gcc version 12.4.0 (NetBSD nb1 20240630)
 CPUINFO: OPENSSL_ia32cap=0x7ffaf3bfffebffff:0x40405f4ef2bf27ef
 The 'numbers' are in 1000s of bytes per second processed.
 type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
 AES-128-CBC     121121.48k   138552.06k   143664.79k   146518.39k   147133.57k   147390.44k
 
 $ /tmp/obj/usr/src/crypto/external/bsd/openssl/bin/openssl speed -evp aes-128-cbc
 Doing AES-128-CBC for 3s on 16 size blocks: 228551884 AES-128-CBC's in 3.00s
 Doing AES-128-CBC for 3s on 64 size blocks: 73155615 AES-128-CBC's in 3.00s
 Doing AES-128-CBC for 3s on 256 size blocks: 18991561 AES-128-CBC's in 3.00s
 Doing AES-128-CBC for 3s on 1024 size blocks: 4788704 AES-128-CBC's in 3.00s
 Doing AES-128-CBC for 3s on 8192 size blocks: 599986 AES-128-CBC's in 3.00s
 Doing AES-128-CBC for 3s on 16384 size blocks: 300192 AES-128-CBC's in 3.00s
 version: 3.0.15
 NetBSD 10.99.14
 options: bn(64,64)
 gcc version 12.4.0 (NetBSD nb1 20240630)
 CPUINFO: OPENSSL_ia32cap=0x7ffaf3bfffebffff:0x40405f4ef2bf27ef
 The 'numbers' are in 1000s of bytes per second processed.
 type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
 AES-128-CBC    1220163.54k  1560653.12k  1622235.44k  1634544.30k  1639454.74k  1639448.58k
 
 $
 ```
 
 using the following patch (only for AES):
 
 ```
 diff -urN a/src/crypto/external/bsd/openssl/lib/libcrypto/arch/x86_64/aes.inc b/src/crypto/external/bsd/openssl/lib/libcrypto/arch/x86_64/aes.inc
 --- a/src/crypto/external/bsd/openssl/lib/libcrypto/arch/x86_64/aes.inc	2018-02-08 21:57:24.000000000 +0000
 +++ b/src/crypto/external/bsd/openssl/lib/libcrypto/arch/x86_64/aes.inc	2025-04-21 06:26:06.653418871 +0000
 @@ -9,5 +9,6 @@
   vpaes-x86_64.S
 
   AESCPPFLAGS = -DAES_ASM -DVPAES_ASM -DBSAES_ASM
 +CPPFLAGS += ${AESCPPFLAGS}
   AESNI = yes
   .include "../../aes.inc"
 diff -urN a/src/crypto/external/bsd/openssl/lib/libcrypto/arch/x86_64/sha.inc b/src/crypto/external/bsd/openssl/lib/libcrypto/arch/x86_64/sha.inc
 --- a/src/crypto/external/bsd/openssl/lib/libcrypto/arch/x86_64/sha.inc	2024-07-16 03:11:54.778622078 +0000
 +++ b/src/crypto/external/bsd/openssl/lib/libcrypto/arch/x86_64/sha.inc	2025-04-21 06:42:36.777374180 +0000
 @@ -2,12 +2,10 @@
   SHA_SRCS = sha1-x86_64.S sha1-mb-x86_64.S keccak1600-x86_64.S
   SHACPPFLAGS = -DSHA1_ASM -DKECCAK1600_ASM
   KECCAKNI = yes
 -.if 0
   # This cannot be enabled until the SHA-2 symbol mess is resolved:
   # https://mail-index.netbsd.org/tech-userlevel/2024/03/17/msg014265.html
   # DO NOT TRY TO ENABLE IT, OR YOU MAY CAUSE NETBSD'S OPENSSL TO BE
   # VULNERABLE TO REMOTE CODE EXECUTION BY STACK BUFFER OVERRUNS.
   SHA_SRCS += sha512-x86_64.S sha256-mb-x86_64.S
 -SHACPPFLAGS+= -DSHA256_ASM -DSHA512_ASM
 -.endif
 +SHACPPFLAGS+= -DSHA256_ASM
   .include "../../sha.inc"
 ```
 
 Note, however, that I also had to fiddle with the scary `.if 0`-disabled section in sha.inc.
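
For reference, the per-block-size speedup can be worked out from the two `openssl speed` summary rows above. This is only a minimal sketch: the throughput figures are copied from those runs, while the names `BLOCK_SIZES`, `BEFORE`, `AFTER` and `speedups` are made up for illustration.

```python
# Throughput in 1000s of bytes per second, copied from the summary rows
# of the two "openssl speed -evp aes-128-cbc" runs above.
# All names below are illustrative; they are not part of OpenSSL.
BLOCK_SIZES = [16, 64, 256, 1024, 8192, 16384]
BEFORE = [121121.48, 138552.06, 143664.79, 146518.39, 147133.57, 147390.44]
AFTER = [1220163.54, 1560653.12, 1622235.44, 1634544.30, 1639454.74, 1639448.58]


def speedups(before, after):
    """Return the after/before throughput ratio for each block size."""
    return [a / b for a, b in zip(after, before)]


if __name__ == "__main__":
    for size, ratio in zip(BLOCK_SIZES, speedups(BEFORE, AFTER)):
        print(f"{size:>6} bytes: {ratio:.1f}x")
```

The ratio comes out at about 10x for 16-byte blocks and a little over 11x for the larger sizes.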
 
 -RVP
 

