Subject: some performance comparisons for 1.3.3 between Pentium and Pentium-II....
To: NetBSD/i386 Discussion List <port-i386@netbsd.org>
From: Greg A. Woods <woods@most.weird.com>
List: port-i386
Date: 02/04/1999 23:43:51
I decided to run bytebench on a couple of machines to see how they
compare.
The first is a little Pentium-133MHz server with 48MB of RAM on an Intel
TC430HX (PCI) motherboard with 512KB cache, and a Fujitsu MPC3043AT IDE
drive. This machine is using a serial console.
The second is an IBM PC-Server 325 with a single Pentium-II 300MHz,
128MB of RAM, an on-board AIC-7880 and several IBM Ultrastar-9LP UltraWide
SCSI drives, two of which were striped together with CCD. This machine
has an on-board S3 Trio32/64 VGA display, but is also actually using a
serial console.
Both machines are running NetBSD 1.3.3 with reasonably similar kernels.
The primary differences (other than hard-wired devices and different
Ethernet drivers) are ["<" == P133, ">" == PII300]:
8c8
< maxusers 32 # estimated number of users
---
> maxusers 128 # estimated number of users
427c444
< pseudo-device bpfilter 8 # Berkeley packet filter
---
> pseudo-device bpfilter 12 # Berkeley packet filter
430,432c447,449
< pseudo-device ppp 2 # Point-to-Point Protocol
< pseudo-device sl 2 # Serial Line IP
< pseudo-device strip 2 # Starmode Radio IP (Metricom)
---
> #pseudo-device ppp 2 # Point-to-Point Protocol
> #pseudo-device sl 2 # Serial Line IP
> #pseudo-device strip 2 # Starmode Radio IP (Metricom)
436c453
< pseudo-device pty 64 # pseudo-terminals
---
> pseudo-device pty 96 # pseudo-terminals
The third machine is a similar PC-Server 325 (64MB RAM, Quantum Viking
UltraWide disks) running a GENERIC NetBSD-current 1.3I kernel of about
98/12/02 (i.e. maxusers=32).
The first two are running the NetBSD native RC5DES Client v2.7024.409,
and the third is running the rc5des432-bsdi2-x86-nomt client.
The P133 and -current machines are also being queried regularly for CPU
stats via rstatd by a cool little tool called cpupie.tcl, and the P133
was running a couple of xterms, one of which was where the bytebench run
was initiated and displayed its progress (and both of which suffered
refresh events as I bopped about on my virtual desktops). The -current
machine was running as an NFS server and doing other minor tasks.
I.e. the P133 and the -current machines were slightly more loaded by
other tasks than the 1.3.3 PII300.
None of these machines was doing anything else appreciable (just an sshd,
xntpd, NFS clients, etc.) during the bytebench runs.
I also noticed that there was a *lot* more "nice" CPU being used than I
expected there would be, especially during such a CPU-heavy benchmark.
Lastly I should point out that the average relative RC5DES key
rates are approximately as one might expect for these comparative
processors: just over 189,000 keys/s for the P133 and about 840,000
keys/s for the PII300s (i.e. the PII is about twice as fast per MHz).
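
For anyone who wants to check the "twice as fast per MHz" figure, here
is a minimal Python sketch of the arithmetic. The key rates and clock
speeds are the approximate ones quoted above; the names `machines` and
`keys_per_mhz` are made up for this illustration.

```python
# Approximate figures quoted in the post ("just over" / "about").
machines = {
    "P133": {"keys_per_s": 189_000, "mhz": 133},
    "PII300": {"keys_per_s": 840_000, "mhz": 300},
}


def keys_per_mhz(keys_per_s, mhz):
    """Normalise a key rate by clock speed (hypothetical helper)."""
    return keys_per_s / mhz


per_mhz = {name: keys_per_mhz(**m) for name, m in machines.items()}
ratio = per_mhz["PII300"] / per_mhz["P133"]

for name, value in per_mhz.items():
    print(f"{name}: {value:.0f} keys/s per MHz")
print(f"PII300 / P133 per-MHz ratio: {ratio:.2f}")
```

With these rounded inputs the ratio comes out a little under 2, which
is consistent with "about twice as fast per MHz".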
(BTW, early in my tests I was extremely surprised to find the benchmark
took almost exactly the same amount of elapsed time to complete on all
machines, then I remembered that bytebench runs each test on a fixed
timer, and counts the total number of iterations!)
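
The fixed-timer idea can be sketched as follows. This is not
bytebench's actual code, only a minimal Python illustration of "run
for a fixed time and count the loops"; `run_for` and `toy_work` are
hypothetical names, and the 0.2 second duration is arbitrary.

```python
import time


def run_for(duration_s, work):
    """Call work() until duration_s has elapsed; return loops per second."""
    iterations = 0
    start = time.perf_counter()
    deadline = start + duration_s
    while time.perf_counter() < deadline:
        work()
        iterations += 1
    elapsed = time.perf_counter() - start
    return iterations / elapsed


def toy_work():
    """Stand-in workload; a real benchmark would do something meaningful."""
    sum(range(100))


lps = run_for(0.2, toy_work)
print(f"{lps:.1f} lps")
```

A faster machine gets through more iterations in the same wall-clock
time, so the elapsed time of the whole run is nearly constant and only
the reported rates differ.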
Here are the final results, P133 first, the -current machine last:
==============================================================
NetBSD becoming 1.3.3 NetBSD 1.3.3 (BECOMING)
BYTE UNIX Benchmarks (Version 3.11)
System -- becoming
Start Benchmark Run: Thu Feb 4 22:03:41 EST 1999
3 interactive users.
Dhrystone 2 without register variables 158708.1 lps (10 secs, 6 samples)
Dhrystone 2 using register variables 156789.0 lps (10 secs, 6 samples)
Arithmetic Test (type = arithoh) 529193.9 lps (10 secs, 6 samples)
Arithmetic Test (type = register) 17800.2 lps (10 secs, 6 samples)
Arithmetic Test (type = short) 15603.2 lps (10 secs, 6 samples)
Arithmetic Test (type = int) 17868.6 lps (10 secs, 6 samples)
Arithmetic Test (type = long) 17763.5 lps (10 secs, 6 samples)
Arithmetic Test (type = float) 23029.9 lps (10 secs, 6 samples)
Arithmetic Test (type = double) 23062.1 lps (10 secs, 6 samples)
System Call Overhead Test 50199.0 lps (10 secs, 6 samples)
Pipe Throughput Test 22571.0 lps (10 secs, 6 samples)
Pipe-based Context Switching Test 7661.4 lps (10 secs, 6 samples)
Process Creation Test 335.0 lps (10 secs, 6 samples)
Execl Throughput Test 68.5 lps (9 secs, 6 samples)
File Read (10 seconds) 78619.0 KBps (10 secs, 6 samples)
File Write (10 seconds) 5074.0 KBps (10 secs, 6 samples)
File Copy (10 seconds) 4512.0 KBps (10 secs, 6 samples)
File Read (30 seconds) 71178.0 KBps (30 secs, 6 samples)
File Write (30 seconds) 5066.0 KBps (30 secs, 6 samples)
File Copy (30 seconds) 4027.0 KBps (30 secs, 6 samples)
C Compiler Test 153.1 lpm (60 secs, 3 samples)
Shell scripts (1 concurrent) 222.9 lpm (60 secs, 3 samples)
Shell scripts (2 concurrent) 114.0 lpm (60 secs, 3 samples)
Shell scripts (4 concurrent) 57.0 lpm (60 secs, 3 samples)
Shell scripts (8 concurrent) 28.0 lpm (60 secs, 3 samples)
Dc: sqrt(2) to 99 decimal places 3397.3 lpm (60 secs, 6 samples)
Recursion Test--Tower of Hanoi 2558.7 lps (10 secs, 6 samples)
INDEX VALUES
TEST                                      BASELINE     RESULT   INDEX
Arithmetic Test (type = double)             2541.7    23062.1     9.1
Dhrystone 2 without register variables     22366.3   158708.1     7.1
Execl Throughput Test                         16.5       68.5     4.2
File Copy (30 seconds)                       179.0     4027.0    22.5
Pipe-based Context Switching Test           1318.5     7661.4     5.8
Shell scripts (8 concurrent)                   4.0       28.0     7.0
                                                            =========
SUM of 6 items                                                   55.6
AVERAGE                                                           9.3
==============================================================
NetBSD public 1.3.3 NetBSD 1.3.3 (ACI)
BYTE UNIX Benchmarks (Version 3.11)
System -- public
Start Benchmark Run: Thu Feb 4 22:03:40 EST 1999
1 interactive users.
Dhrystone 2 without register variables 545551.6 lps (10 secs, 6 samples)
Dhrystone 2 using register variables 540259.7 lps (10 secs, 6 samples)
Arithmetic Test (type = arithoh) 1216689.5 lps (10 secs, 6 samples)
Arithmetic Test (type = register) 72192.3 lps (10 secs, 6 samples)
Arithmetic Test (type = short) 40777.0 lps (10 secs, 6 samples)
Arithmetic Test (type = int) 72170.5 lps (10 secs, 6 samples)
Arithmetic Test (type = long) 72309.0 lps (10 secs, 6 samples)
Arithmetic Test (type = float) 82724.6 lps (10 secs, 6 samples)
Arithmetic Test (type = double) 82397.9 lps (10 secs, 6 samples)
System Call Overhead Test 110749.4 lps (10 secs, 6 samples)
Pipe Throughput Test 82137.7 lps (10 secs, 6 samples)
Pipe-based Context Switching Test 22224.7 lps (10 secs, 6 samples)
Process Creation Test 745.4 lps (10 secs, 6 samples)
Execl Throughput Test 208.2 lps (9 secs, 6 samples)
File Read (10 seconds) 267382.0 KBps (10 secs, 6 samples)
File Write (10 seconds) 9200.0 KBps (10 secs, 6 samples)
File Copy (10 seconds) 9010.0 KBps (10 secs, 6 samples)
File Read (30 seconds) 239705.0 KBps (30 secs, 6 samples)
File Write (30 seconds) 8821.0 KBps (30 secs, 6 samples)
File Copy (30 seconds) 8508.0 KBps (30 secs, 6 samples)
C Compiler Test 255.7 lpm (60 secs, 3 samples)
Shell scripts (1 concurrent) 288.0 lpm (60 secs, 3 samples)
Shell scripts (2 concurrent) 150.7 lpm (60 secs, 3 samples)
Shell scripts (4 concurrent) 79.3 lpm (60 secs, 3 samples)
Shell scripts (8 concurrent) 38.3 lpm (60 secs, 3 samples)
Dc: sqrt(2) to 99 decimal places 9236.7 lpm (60 secs, 6 samples)
Recursion Test--Tower of Hanoi 6408.2 lps (10 secs, 6 samples)
INDEX VALUES
TEST                                      BASELINE     RESULT   INDEX
Arithmetic Test (type = double)             2541.7    82397.9    32.4
Dhrystone 2 without register variables     22366.3   545551.6    24.4
Execl Throughput Test                         16.5      208.2    12.6
File Copy (30 seconds)                       179.0     8508.0    47.5
Pipe-based Context Switching Test           1318.5    22224.7    16.9
Shell scripts (8 concurrent)                   4.0       38.3     9.6
                                                            =========
SUM of 6 items                                                  143.4
AVERAGE                                                          23.9
==============================================================
NetBSD proven 1.3I NetBSD 1.3I (GENERIC)
BYTE UNIX Benchmarks (Version 3.11)
System -- proven
Start Benchmark Run: Thu Feb 4 17:36:57 EST 1999
8 interactive users.
Dhrystone 2 without register variables 532987.4 lps (10 secs, 6 samples)
Dhrystone 2 using register variables 536171.0 lps (10 secs, 6 samples)
Arithmetic Test (type = arithoh) 1149515.4 lps (10 secs, 6 samples)
Arithmetic Test (type = register) 68351.7 lps (10 secs, 6 samples)
Arithmetic Test (type = short) 40720.6 lps (10 secs, 6 samples)
Arithmetic Test (type = int) 72424.0 lps (10 secs, 6 samples)
Arithmetic Test (type = long) 71910.9 lps (10 secs, 6 samples)
Arithmetic Test (type = float) 83142.5 lps (10 secs, 6 samples)
Arithmetic Test (type = double) 82692.4 lps (10 secs, 6 samples)
System Call Overhead Test 108290.3 lps (10 secs, 6 samples)
Pipe Throughput Test 82363.6 lps (10 secs, 6 samples)
Pipe-based Context Switching Test 26330.1 lps (10 secs, 6 samples)
Process Creation Test 753.0 lps (10 secs, 6 samples)
Execl Throughput Test 224.9 lps (9 secs, 6 samples)
File Read (10 seconds) 243405.0 KBps (10 secs, 6 samples)
File Write (10 seconds) 4068.0 KBps (10 secs, 6 samples)
File Copy (10 seconds) 4356.0 KBps (10 secs, 6 samples)
File Read (30 seconds) 209660.0 KBps (30 secs, 6 samples)
File Write (30 seconds) 4422.0 KBps (30 secs, 6 samples)
File Copy (30 seconds) 4149.0 KBps (30 secs, 6 samples)
C Compiler Test 210.6 lpm (60 secs, 3 samples)
Shell scripts (1 concurrent) 268.7 lpm (60 secs, 3 samples)
Shell scripts (2 concurrent) 161.0 lpm (60 secs, 3 samples)
Shell scripts (4 concurrent) 85.0 lpm (60 secs, 3 samples)
Shell scripts (8 concurrent) 43.0 lpm (60 secs, 3 samples)
Dc: sqrt(2) to 99 decimal places 10310.1 lpm (60 secs, 6 samples)
Recursion Test--Tower of Hanoi 6269.3 lps (10 secs, 6 samples)
INDEX VALUES
TEST                                      BASELINE     RESULT   INDEX
Arithmetic Test (type = double)             2541.7    82692.4    32.5
Dhrystone 2 without register variables     22366.3   532987.4    23.8
Execl Throughput Test                         16.5      224.9    13.6
File Copy (30 seconds)                       179.0     4149.0    23.2
Pipe-based Context Switching Test           1318.5    26330.1    20.0
Shell scripts (8 concurrent)                   4.0       43.0    10.8
                                                            =========
SUM of 6 items                                                  123.9
AVERAGE                                                          20.6
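
The INDEX columns above appear to be simply RESULT divided by
BASELINE, with the AVERAGE being the mean of the six indexes. A
minimal Python sketch that recomputes them from the numbers in the
three tables follows; the machine labels and the helper name
`index_average` are made up for this illustration.

```python
# Baselines and results copied from the three INDEX VALUES tables.
BASELINE = {
    "Arithmetic Test (type = double)": 2541.7,
    "Dhrystone 2 without register variables": 22366.3,
    "Execl Throughput Test": 16.5,
    "File Copy (30 seconds)": 179.0,
    "Pipe-based Context Switching Test": 1318.5,
    "Shell scripts (8 concurrent)": 4.0,
}

# Results listed in the same order as the BASELINE keys.
RESULTS = {
    "becoming (P133)": [23062.1, 158708.1, 68.5, 4027.0, 7661.4, 28.0],
    "public (PII300)": [82397.9, 545551.6, 208.2, 8508.0, 22224.7, 38.3],
    "proven (PII300, -current)": [82692.4, 532987.4, 224.9, 4149.0, 26330.1, 43.0],
}


def index_average(results):
    """Mean of result/baseline over the six indexed tests."""
    indexes = [r / b for r, b in zip(results, BASELINE.values())]
    return sum(indexes) / len(indexes)


averages = {name: index_average(vals) for name, vals in RESULTS.items()}
for name, avg in averages.items():
    print(f"{name}: average index {avg:.1f}")
```

Computed this way the averages round to the 9.3, 23.9 and 20.6
reported above, which puts the 1.3.3 PII300 at roughly 2.6 times the
P133 on this composite index. Note that File Copy contributes a large
share of each sum, so the disk setup matters as much as the CPU.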
--
Greg A. Woods
+1 416 218-0098 VE3TCP <gwoods@acm.org> <robohack!woods>
Planix, Inc. <woods@planix.com>; Secrets of the Weird <woods@weird.com>