Subject: Re: OpenSSL RSA very very slow
To: J.T. Conklin <jtc@acorntoolworks.com>
From: Johnny C. Lam <jlam@NetBSD.org>
List: port-amd64
Date: 01/13/2005 18:01:56
On Thu, Jan 13, 2005 at 06:09:26AM -0800, J.T. Conklin wrote:
> "Johnny C. Lam" <jlam@NetBSD.org> writes:
> >> FWIW, the pkgsrc openssl already builds with these optimizations.  I
> >> had copied part of the linux-x86_64 openssl config when creating the
> >> entry in the Configure script for NetBSD/amd64.  The only remaining
> >> part of the linux-x86_64 config that I haven't copied over is RC4_CHUNK,
> >> which is supposed to "enable code that handles data aligned at long
> >> (natural CPU word) boundary".  I'll test a build with and without this
> >> extra definition to see if the RSA speed differs.
> >
> > The output from "openssl speed rsa" is basically the same between
> > openssl built with and without RC4_CHUNK defined.  I will leave the
> > pkgsrc openssl the way that it currently is.
> 
> That would make sense, since AFAIK RC4_CHUNK is only used for the RC4
> implementation.  Is there any difference with "openssl speed rc4"?

These are the results of running "openssl speed $alg" for various $alg:

OpenSSL 0.9.7d (NetBSD 2.0_STABLE)
======================================================
options:bn(32,32) md2(int) rc4(ptr,int) des(idx,cisc,4,int) aes(partial) blowfish(idx) 
compiler: gcc version 3.3.3 (NetBSD nb3 20040520)
type              16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md2               1266.76k     2675.23k     3707.54k     4094.51k     4225.70k
rc4             120561.91k   131208.27k   133609.41k   134724.82k   135140.25k
sha1             13256.12k    38947.06k    92326.33k   140224.43k   165506.85k
rmd160            9188.77k    24984.33k    51711.99k    70548.63k    78992.53k
des cbc          32900.26k    34654.87k    35237.14k    35378.52k    35381.70k
des ede3         12956.07k    13256.68k    13331.77k    13350.59k    13356.19k
aes-128 cbc      76239.27k    85067.93k    81337.79k    81675.00k    81709.39k
aes-192 cbc      68104.06k    70665.19k    71981.01k    72270.73k    72392.51k
aes-256 cbc      61388.14k    62874.22k    64577.72k    64811.35k    64909.94k
blowfish cbc     57792.73k    62021.65k    62992.07k    63370.59k    63427.86k
                  sign    verify    sign/s verify/s
rsa  512 bits   0.0012s   0.0001s    849.4   7462.7
rsa 1024 bits   0.0068s   0.0004s    146.6   2471.4
rsa 2048 bits   0.0438s   0.0014s     22.9    713.4
rsa 4096 bits   0.3060s   0.0051s      3.3    197.2


OpenSSL 0.9.7e (SIXTY_FOUR_BIT_LONG RC4_CHUNK BF_PTR2 DES_INT DES_UNROLL)
=========================================================================
options:bn(64,64) md2(int) rc4(ptr,int) des(idx,cisc,16,int) aes(partial) blowfish(ptr2)
compiler: gcc -fPIC -DDSO_DLFCN -DHAVE_DLFCN_H -DOPENSSL_NO_KRB5 -DOPENSSL_NO_IDEA -DOPENSSL_NO_RC5 -DOPENSSL_NO_MDC2 -O2 -march=athlon-mp -DTERMIOS -DL_ENDIAN -DMD32_REG_T=int -O2
type              16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md2               1269.98k     2678.03k     3714.94k     4109.57k     4241.76k
rc4             118630.26k   130710.79k   133200.41k   134652.99k   135046.55k
sha1             12670.60k    39026.87k    92295.54k   140290.06k   165557.05k
rmd160            9707.20k    26822.16k    55859.52k    76605.68k    85994.82k
des cbc          32740.13k    34286.67k    34649.64k    34793.37k    34833.54k
des ede3         13042.50k    13231.94k    13330.39k    13350.21k    13355.96k
aes-128 cbc      77190.03k    80356.79k    81827.17k    82188.00k    82225.86k
aes-192 cbc      68297.52k    71120.59k    72383.21k    72674.52k    72735.79k
aes-256 cbc      61292.62k    64017.44k    64974.08k    65196.99k    65265.41k
blowfish cbc     58712.34k    63287.33k    64300.53k    64799.73k    64917.34k
                  sign    verify    sign/s verify/s
rsa  512 bits   0.0003s   0.0000s   3325.1  38862.5
rsa 1024 bits   0.0009s   0.0001s   1061.2  17349.3
rsa 2048 bits   0.0049s   0.0002s    204.5   6478.5
rsa 4096 bits   0.0301s   0.0005s     33.2   2086.3

The second group of results is the best of the different sets of options
that I tried, so I'm committing that set of options to pkgsrc openssl.
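
For anyone who wants to put a number on the difference, here is a rough
sketch that computes the ratios between the two builds.  The figures are
copied by hand from the two tables above; the variable and function names
are made up for illustration and are not part of any OpenSSL tooling.
Keep in mind that the two builds also differ in version (0.9.7d vs. 0.9.7e)
and compiler flags, so the ratios describe the builds as a whole rather
than any single option.

```python
# RSA results copied from the two "openssl speed" tables above.
# Layout (an assumption of this sketch): key size in bits -> (sign/s, verify/s)
base_rsa = {
    512: (849.4, 7462.7),
    1024: (146.6, 2471.4),
    2048: (22.9, 713.4),
    4096: (3.3, 197.2),
}
pkgsrc_rsa = {
    512: (3325.1, 38862.5),
    1024: (1061.2, 17349.3),
    2048: (204.5, 6478.5),
    4096: (33.2, 2086.3),
}

# rc4 throughput at the largest block size (last column), in 1000s of bytes/s.
base_rc4 = 135140.25
pkgsrc_rc4 = 135046.55


def speedups(old, new):
    """Return {bits: (sign ratio, verify ratio)} of new over old."""
    return {
        bits: (new[bits][0] / old[bits][0], new[bits][1] / old[bits][1])
        for bits in sorted(old)
    }


rsa_ratios = speedups(base_rsa, pkgsrc_rsa)
rc4_ratio = pkgsrc_rc4 / base_rc4

for bits, (sign, verify) in rsa_ratios.items():
    print(f"rsa {bits:4d} bits   sign x{sign:.1f}   verify x{verify:.1f}")
print(f"rc4 (last column)   x{rc4_ratio:.3f}")
```

With these inputs the RSA ratios grow with key size, from roughly 4x for
512-bit signing to roughly 10x for 4096-bit, while the rc4 ratio stays
within 1% of 1.0.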

	Cheers,

	-- Johnny Lam <jlam@NetBSD.org>