Re: glxsb(4) doesn't appear to be working for me (was: AMD Geode LX Security Block)

At Thu, 29 Oct 2009 22:30:23 -0400, Thor Lancelot Simon wrote:
> You may need to explicitly specify -engine cryptodev, and note that you
> will not get *any* acceleration from openssl speed for any cipher
> unless you specify it as an "evp" instead of by the shortcut name:
> openssl speed -engine cryptodev -elapsed -evp aes-128-cbc

I'm not sure I understand.  None of the examples I saw on the NetBSD
lists show this (and it's not explained at all in the manual page).

It looks like the algorithm can also be given on the command line:

  openssl speed -engine cryptodev -elapsed -evp aes-128-cbc aes-128-cbc

and then the program seems to run the test twice, once in a way that
will make use of /dev/crypto.

"-engine cryptodev" does now indeed make the huge difference I was
expecting, and I see the same kinds of stats others have posted.

I've since found similar examples using "-evp aes-128-cbc" on the
FreeBSD lists (regarding the same driver and device), as well as other
tests that make use of the device such as:

  # dd if=/dev/zero bs=4k count=100000 | \
    openssl enc -aes-128-cbc -e -out /dev/null -nosalt -k abcdefhij -engine 
  10000+0 records in
  10000+0 records out
  81920000 bytes transferred in 5.465 secs (14989935 bytes/sec)

I can also confirm that on NetBSD-4 with the native OpenSSL 0.9.8e the
"cryptodev" engine must be specified in order to make use of the device.

# for i in 1 2 3 4 5 6 7 8 9 0 ; do
        openssl speed -multi 10 -evp aes-128-cbc -elapsed 2>/dev/null | tail -1;
   done | awk '
        {n1=$1; t1+=$2; t2+=$3; t3+=$4; t4+=$5; t5+=$6;}
        END{printf("%-13s %11.2fk %11.2fk %11.2fk %11.2fk %11.2fk  (%d runs)\n",
                n1, t1/NR, t2/NR, t3/NR, t4/NR, t5/NR, NR)}'
evp                310.16k     1229.65k     4354.50k    10540.60k    62369.80k  (10 runs)

# sysctl -w kern.usercrypto=0
evp               4917.08k     5519.23k     5746.64k     5808.70k     8549.20k  (10 runs)

For comparison my Dell PE2650 2*2.4GHz HTT server gets:

evp              22753.38k    26595.67k    31588.73k    31056.11k    35666.74k  (10 runs)

> FWIW, glxsb is not very efficient and the syscall overhead will just
> kill you for all but very large requests.  You may see better results
> with -multi 32 to get some parallelism going to hide the latency.
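
Dividing the /dev/crypto averages by the software-only averages makes
that overhead visible per block size.  A rough sketch follows, assuming
the first "evp" line above is the run that went through /dev/crypto, the
second is the kern.usercrypto=0 run, and the five columns are the usual
16, 64, 256, 1024 and 8192 byte blocks (the header line was dropped by
"tail -1", so the sizes are an assumption).  The name "speedup" is made
up for this illustration:

```shell
#!/bin/sh
# speedup: hypothetical helper, not part of OpenSSL or NetBSD.
# Reads two averaged "evp" lines on stdin (hardware run first, software
# run second) and prints the hardware/software ratio for each column.
speedup() {
    awk '
        # "+ 0" forces a numeric conversion, which drops the trailing "k"
        NR == 1 { for (i = 2; i <= 6; i++) hw[i] = $i + 0 }
        NR == 2 { for (i = 2; i <= 6; i++) sw[i] = $i + 0 }
        END {
            # assumed block sizes of the five "openssl speed" columns
            split("16 64 256 1024 8192", size, " ")
            for (i = 2; i <= 6; i++)
                printf("%5d bytes: %6.2fx\n", size[i - 1], hw[i] / sw[i])
        }'
}

# Figures copied from the two 10-run averages quoted earlier in this message.
speedup <<'EOF'
evp                310.16k     1229.65k     4354.50k    10540.60k    62369.80k
evp               4917.08k     5519.23k     5746.64k     5808.70k     8549.20k
EOF
```

With these figures the device only comes out ahead from 1024-byte
blocks upward (about 1.8x) and reaches roughly 7.3x at 8192 bytes, while
16-byte requests run at about 0.06x of the software speed.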


Thank you very much!

                                                Greg A. Woods
                                                Planix, Inc.

                                                +1 416 218 0099

