Subject: Re: CVS commit: htdocs/Ports/sgimips
To: None <port-sgimips@netbsd.org>
From: Pavel Cahyna <pcah8322@artax.karlin.mff.cuni.cz>
List: port-sgimips
Date: 10/18/2005 15:26:17
I tested the speed of a kernel with the L2 cache patch, built by Tsutsui-san.
The results suggest that the patch works as expected, thanks!
openssl speed shows some improvement in the RSA numbers (md5, sha1, rmd160
and DSA also improve, mostly at small block or key sizes):
----------------------------------
# 2.0.2 kernel (3.99.5 is similar)
OpenSSL 0.9.7d 17 Mar 2004
built on: NetBSD 2.0.2
options:bn(32,32) md2(int) rc4(ptr,int) des(idx,cisc,16,long) aes(partial) blowfish(ptr)
compiler: gcc version 3.3.3 (NetBSD nb3 20040520)
available timing options: USE_TOD HZ=100 [sysconf value]
timing function used: getrusage
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes
md2 69.70k 143.32k 195.15k 214.54k 221.09k
mdc2 0.00 0.00 0.00 0.00 0.00
md4 703.75k 2391.73k 6547.48k 11574.18k 14898.21k
md5 527.37k 1401.56k 3975.47k 7671.08k 10756.88k
hmac(md5) 871.60k 2677.14k 6291.70k 9327.22k 11114.04k
sha1 357.43k 1067.68k 2758.33k 4567.58k 5641.01k
rmd160 326.35k 534.62k 1567.75k 3044.26k 4197.30k
rc4 6434.60k 7242.23k 7477.01k 7537.17k 7551.68k
des cbc 1186.65k 1230.35k 1242.24k 1244.24k 1244.93k
des ede3 436.07k 442.44k 444.20k 444.52k 444.39k
idea cbc 0.00 0.00 0.00 0.00 0.00
rc2 cbc 1358.28k 1418.57k 1434.42k 1437.14k 1438.46k
rc5-32/12 cbc 0.00 0.00 0.00 0.00 0.00
blowfish cbc 2928.43k 3223.89k 3308.59k 3329.24k 3322.73k
cast cbc 1850.39k 1963.51k 1994.23k 2002.64k 2000.04k
aes-128 cbc 1629.00k 1676.46k 1688.07k 1690.13k 1687.87k
aes-192 cbc 1409.06k 1444.35k 1453.58k 1455.16k 1452.97k
aes-256 cbc 1239.85k 1265.87k 1273.16k 1274.87k 1271.86k
sign verify sign/s verify/s
rsa 512 bits 0.0417s 0.0039s 24.0 255.5
rsa 1024 bits 0.1941s 0.0109s 5.2 91.3
rsa 2048 bits 1.1627s 0.0368s 0.9 27.2
rsa 4096 bits 7.9672s 0.1341s 0.1 7.5
sign verify sign/s verify/s
dsa 512 bits 0.0311s 0.0384s 32.2 26.0
dsa 1024 bits 0.0955s 0.1182s 10.5 8.5
dsa 2048 bits 0.3361s 0.4188s 3.0 2.4
----------------------------------
----------------------------------
# patched kernel:
OpenSSL 0.9.7d 17 Mar 2004
built on: NetBSD 2.0.2
options:bn(32,32) md2(int) rc4(ptr,int) des(idx,cisc,16,long) aes(partial) blowfish(ptr)
compiler: gcc version 3.3.3 (NetBSD nb3 20040520)
available timing options: USE_TOD HZ=100 [sysconf value]
timing function used: getrusage
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes
md2 69.15k 143.24k 195.12k 214.57k 221.03k
mdc2 0.00 0.00 0.00 0.00 0.00
md4 706.21k 2396.16k 6560.78k 11590.58k 14922.89k
md5 579.50k 1789.38k 4859.34k 8511.63k 10968.92k
hmac(md5) 871.43k 2679.93k 6295.50k 9475.68k 11146.05k
sha1 470.51k 1398.36k 3258.44k 4885.01k 5715.31k
rmd160 421.80k 945.10k 2310.91k 3616.55k 4331.32k
rc4 6437.29k 7235.76k 7477.91k 7539.55k 7548.35k
des cbc 1186.88k 1231.00k 1242.52k 1244.64k 1245.89k
des ede3 436.66k 443.27k 445.06k 445.38k 445.08k
idea cbc 0.00 0.00 0.00 0.00 0.00
rc2 cbc 1360.00k 1418.41k 1434.55k 1438.57k 1440.47k
rc5-32/12 cbc 0.00 0.00 0.00 0.00 0.00
blowfish cbc 2929.79k 3227.36k 3309.88k 3331.39k 3327.69k
cast cbc 1853.36k 1965.40k 1996.68k 2004.73k 2005.90k
aes-128 cbc 1625.68k 1672.40k 1684.36k 1687.42k 1687.83k
aes-192 cbc 1408.43k 1443.85k 1452.79k 1455.04k 1453.73k
aes-256 cbc 1241.13k 1268.60k 1275.22k 1277.42k 1277.28k
sign verify sign/s verify/s
rsa 512 bits 0.0318s 0.0034s 31.5 294.2
rsa 1024 bits 0.1725s 0.0103s 5.8 97.0
rsa 2048 bits 1.1174s 0.0360s 0.9 27.8
rsa 4096 bits 7.8438s 0.1326s 0.1 7.5
sign verify sign/s verify/s
dsa 512 bits 0.0272s 0.0333s 36.7 30.1
dsa 1024 bits 0.0907s 0.1116s 11.0 9.0
dsa 2048 bits 0.3255s 0.4121s 3.1 2.4
----------------------------------
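To put a number on the RSA improvement, the signing times from the two runs above can be compared directly. This is only a rough sketch: the values are copied by hand from the tables, the names (rsa_sign_old, rsa_sign_new, speedup_percent) are made up for illustration, and a single run on each kernel says nothing about measurement noise.

```python
# RSA signing times in seconds per operation, copied from the two
# "openssl speed" runs above (key size in bits -> seconds per sign).
rsa_sign_old = {512: 0.0417, 1024: 0.1941, 2048: 1.1627, 4096: 7.9672}
rsa_sign_new = {512: 0.0318, 1024: 0.1725, 2048: 1.1174, 4096: 7.8438}


def speedup_percent(old_seconds, new_seconds):
    """Throughput gain in percent when the time per operation drops."""
    return (old_seconds / new_seconds - 1.0) * 100.0


# Gain per key size, patched kernel relative to the 2.0.2 kernel.
rsa_gain = {bits: speedup_percent(rsa_sign_old[bits], rsa_sign_new[bits])
            for bits in rsa_sign_old}

for bits, gain in sorted(rsa_gain.items()):
    print(f"rsa {bits:4d} bits sign: {gain:+.1f}%")
```

With these figures the gain is largest for 512-bit keys and shrinks as the key size grows.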
iperf also shows notable improvements:
# 3.99.5 kernel:
------------------------------------------------------------
Client connecting to pc111, TCP port 5001
TCP window size: 17.0 KByte (WARNING: requested 16.0 KByte)
------------------------------------------------------------
[ 3] local xxx.xxx.xxx.xxx port 65534 connected with xxx.xxx.xxx.xxx port 5001
[ 3] 0.0-100.0 sec 320 MBytes 26.9 Mbits/sec
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 16.0 KByte
------------------------------------------------------------
[ 4] local xxx.xxx.xxx.xxx port 5001 connected with xxx.xxx.xxx.xxx port 2024
[ 4] 0.0-10.0 sec 49.8 MBytes 41.8 Mbits/sec
# patched kernel:
------------------------------------------------------------
Client connecting to pc111, TCP port 5001
TCP window size: 17.0 KByte (WARNING: requested 16.0 KByte)
------------------------------------------------------------
[ 3] local xxx.xxx.xxx.xxx port 65531 connected with xxx.xxx.xxx.xxx port 5001
[ 3] 0.0-100.0 sec 410 MBytes 34.4 Mbits/sec
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 16.0 KByte
------------------------------------------------------------
[ 4] local xxx.xxx.xxx.xxx port 5001 connected with xxx.xxx.xxx.xxx port 2022
[ 4] 0.0-10.0 sec 60.8 MBytes 51.0 Mbits/sec
[ 4] local xxx.xxx.xxx.xxx port 5001 connected with xxx.xxx.xxx.xxx port 2023
[ 4] 0.0-10.0 sec 60.8 MBytes 50.9 Mbits/sec
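The iperf figures can be compared the same way. The sketch below computes the relative gain in each direction and re-derives the Mbits/sec value from the transferred MBytes as a sanity check. It assumes iperf counts MBytes in units of 2^20 bytes and Mbits in units of 10^6 bits, which matches the output above to within rounding. The function names are hypothetical.

```python
def gain_percent(old_rate, new_rate):
    """Relative throughput gain in percent."""
    return (new_rate / old_rate - 1.0) * 100.0


def mbits_per_sec(mbytes, seconds):
    """Convert a transfer size to a rate.

    Assumption: MBytes are 2**20 bytes and Mbits are 10**6 bits.
    """
    return mbytes * 2**20 * 8 / seconds / 1e6


# Rates in Mbits/sec, copied from the iperf output above.
client_gain = gain_percent(26.9, 34.4)   # sending from the O2
server_gain = gain_percent(41.8, 51.0)   # receiving on the O2

print(f"client: {client_gain:+.1f}%  server: {server_gain:+.1f}%")
print(f"320 MBytes in 100 s = {mbits_per_sec(320, 100.0):.1f} Mbits/sec")
```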
Here is the dmesg diff, which shows one probably unrelated improvement:
the settings for the SCSI devices are better.
--- netbsd/O2-3.99.5.good 2005-10-18 15:00:53.000000000 +0200
+++ netbsd/O2-3.99.9.cache.good 2005-10-18 15:18:57.000000000 +0200
@@ -3,8 +3,8 @@
Copyright (c) 1982, 1986, 1989, 1991, 1993
The Regents of the University of California. All rights reserved.
-NetBSD 3.99.5 (GENERIC32_IP3x) #0: Tue May 31 00:58:19 UTC 2005
- builds@works.netbsd.org:/home/builds/ab/HEAD/sgimips/200505290000Z-obj/home/builds/ab/HEAD/src/sys/arch/sgimips/compile/GENERIC32_IP3x
+NetBSD 3.99.9 (GENERIC32_IP3x) #4: Tue Oct 18 08:16:26 JST 2005
+ tsutsui@mirage:/usr/src/sys/arch/sgimips/compile/GENERIC32_IP3x
total memory = 256 MB
(3716 KB reserved for ARCS)
avail memory = 242 MB
@@ -12,7 +12,7 @@
cpu0 at mainbus0: MIPS R5000 CPU (0x2321) Rev. 2.1 with built-in FPU Rev. 1.0
cpu0: 32KB/32B 2-way set-associative L1 Instruction cache, 48 TLB entries
cpu0: 32KB/32B 2-way set-associative write-back L1 Data cache
-cpu0: 512KB/32B direct-mapped write-through L2 Data cache
+cpu0: 512KB/32B direct-mapped write-through L2 Unified cache
crime0 at mainbus0 addr 0x14000000: rev 1.1 (CRIME_ID: a1)
mace0 at mainbus0 addr 0x1f000000
lpt0 at mace0 offset 0x380000 intr 4 intrmask 0xf0000
@@ -41,22 +41,24 @@
mace: established interrupt 8 (level 0)
ahc0: interrupting at crime interrupt 8
ahc0: Using left over BIOS settings
-ahc0: aic7880: Wide Channel A, SCSI Id=0, 16/253 SCBs
+ahc0: Host Adapter Bios disabled. Using default SCSI device parameters
+ahc0: aic7880: Wide Channel A, SCSI Id=7, 16/253 SCBs
scsibus0 at ahc0: 16 targets, 8 luns per target
ahc1 at pci0 dev 2 function 0: Adaptec aic7880 Ultra SCSI adapter
mace: established interrupt 9 (level 0)
ahc1: interrupting at crime interrupt 9
ahc1: Using left over BIOS settings
-ahc1: aic7880: Wide Channel A, SCSI Id=0, 16/253 SCBs
+ahc1: Host Adapter Bios disabled. Using default SCSI device parameters
+ahc1: aic7880: Wide Channel A, SCSI Id=7, 16/253 SCBs
scsibus1 at ahc1: 16 targets, 8 luns per target
biomask 07 netmask 07 ttymask 07 clockmask 87
scsibus0: waiting 2 seconds for devices to settle...
scsibus1: waiting 2 seconds for devices to settle...
sd0 at scsibus0 target 1 lun 0: <SGI, IBM DCHS04Y, 3030> disk fixed
sd0: 4340 MB, 6077 cyl, 9 head, 162 sec, 512 bytes/sect x 8888543 sectors
-sd0: async, 8-bit transfers, tagged queueing
+sd0: sync (100.00ns offset 8), 16-bit (20.000MB/s) transfers, tagged queueing
cd0 at scsibus0 target 4 lun 0: <TOSHIBA, CD-ROM XM-5401TA, 3605> cdrom removable
-cd0: async, 8-bit transfers
+cd0: sync (236.00ns offset 15), 8-bit (4.237MB/s) transfers
boot device: sd0
root on sd0a dumps on sd0b
root file system type: ffs
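For reference, the transfer rates that the patched kernel prints for sd0 and cd0 follow from the negotiated sync period and bus width: one transfer per period, times the bus width in bytes. A minimal sketch of that arithmetic (the function name is made up, and this is not how the ahc driver computes it internally):

```python
def scsi_rate_mb_s(period_ns, width_bits):
    """Peak synchronous transfer rate in MB/s (10**6 bytes per second)."""
    transfers_per_us = 1e3 / period_ns      # period in ns -> transfers per microsecond
    return transfers_per_us * (width_bits / 8)


sd0_rate = scsi_rate_mb_s(100.00, 16)   # sd0: sync 100.00ns, 16-bit
cd0_rate = scsi_rate_mb_s(236.00, 8)    # cd0: sync 236.00ns, 8-bit

print(f"sd0: {sd0_rate:.3f}MB/s  cd0: {cd0_rate:.3f}MB/s")
```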