Subject: Re: shell script performance improvement
To: None <port-mips@netbsd.org>
From: Simon Burge <simonb@netbsd.org>
List: port-mips
Date: 03/27/2000 12:11:55
Here are my tests for a 5000/240 vs. a 5900-60.  I'll include a brief
commentary below.

Processor, Processes - times in microseconds - smaller is better
----------------------------------------------------------------
Host                 OS  Mhz null null      open selct sig  sig  fork exec sh  
                             call  I/O stat clos       inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ----- ---- ---- ---- ---- ----
mips-dec-    ULTRIX 4.5   39  7.9       177  226 0.78K 18.2   88 3.5K  11K  24K
mips-dec-    ULTRIX 4.5  117  3.7  28.   80   99 0.39K 13.8   41 5.6K  14K  30K
pmax-netb   NetBSD 1.4W   39 12.1  49.  339  319 0.70K 22.6   52 64.K 182K 358K
pmax-netb   NetBSD 1.4W  117  4.3  22.  121  141 0.31K  9.2   25 5.8K  43K  67K

Context switching - times in microseconds - smaller is better
-------------------------------------------------------------
Host                 OS 2p/0K 2p/16K 2p/64K 8p/16K 8p/64K 16p/16K 16p/64K
                        ctxsw  ctxsw  ctxsw ctxsw  ctxsw   ctxsw   ctxsw
--------- ------------- ----- ------ ------ ------ ------ ------- -------
mips-dec-    ULTRIX 4.5   13    186    923   392   1029     440    1062
mips-dec-    ULTRIX 4.5   46    356    963   251   1493     296    1738
pmax-netb   NetBSD 1.4W   27    207    890   439   1089     489    1113
pmax-netb   NetBSD 1.4W   21    157    554   223   1261     282    1602

*Local* Communication latencies in microseconds - smaller is better
-------------------------------------------------------------------
Host                 OS 2p/0K  Pipe AF     UDP  RPC/   TCP  RPC/ TCP
                        ctxsw       UNIX         UDP         TCP conn
--------- ------------- ----- ----- ---- ----- ----- ----- ----- ----
mips-dec-    ULTRIX 4.5    13   228  292   975         635      23006
mips-dec-    ULTRIX 4.5    46   101  146   404         302       1678
pmax-netb   NetBSD 1.4W    27   196  172
pmax-netb   NetBSD 1.4W    21   108  110

File & VM system latencies in microseconds - smaller is better
--------------------------------------------------------------
Host                 OS   0K File      10K File      Mmap    Prot    Page       
                        Create Delete Create Delete  Latency Fault   Fault 
--------- ------------- ------ ------ ------ ------  ------- -----   ----- 
mips-dec-    ULTRIX 4.5    254     58    751    131                      
mips-dec-    ULTRIX 4.5    189     61   1265    128                       
pmax-netb   NetBSD 1.4W   1851   1149   5000   3448   231177    25    9.5K
pmax-netb   NetBSD 1.4W   2857   1149   4761   3333   317606     5    9.2K

*Local* Communication bandwidths in MB/s - bigger is better
-----------------------------------------------------------
Host                OS  Pipe AF    TCP  File   Mmap  Bcopy  Bcopy  Mem   Mem
                             UNIX      reread reread (libc) (hand) read write
--------- ------------- ---- ---- ---- ------ ------ ------ ------ ---- -----
mips-dec-    ULTRIX 4.5    8   11                        23     23   33    86
mips-dec-    ULTRIX 4.5   16   10           1            10      9   23    18
pmax-netb   NetBSD 1.4W    9    9   -1      9     33     24     25   33    86
pmax-netb   NetBSD 1.4W   15   14   -1      9     24     10     10   24    18

Memory latencies in nanoseconds - smaller is better
    (WARNING - may not be correct, check graphs)
---------------------------------------------------
Host                 OS   Mhz  L1 $   L2 $    Main mem    Guesses
--------- -------------   ---  ----   ----    --------    -------
mips-dec-    ULTRIX 4.5    39    50   1472        1456    No L2 cache?
mips-dec-    ULTRIX 4.5   117    23    281        1269
pmax-netb   NetBSD 1.4W    39    50   1249        1443    No L2 cache?
pmax-netb   NetBSD 1.4W   117    25    333        1257


1) On the r4k, fork times are comparable with Ultrix, but "exec proc"
   and "sh proc" are still slow for each CPU type.  There's something
   really wrong with the MIPS1 code somewhere.

2) System calls are slower than under Ultrix.

3) The file create/delete times under Ultrix are on machines that have
   prestoserve installed and active.  This is not a fair comparison with
   NetBSD.

4) The "mmap latency" benchmark for 1.4.1 on a 5000/240 is only 49203,
   so something has gotten really out of shape since then.

5) I'll look into why we didn't get any UDP/TCP results for NetBSD.
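
To put rough numbers on points 1 and 4, here is a small sketch that
divides the NetBSD figures by the matching reference figures from the
tables above.  The helper names (slowdown, report, report_all) are made
up for illustration, "K" is read as x1000, and I am assuming the 39 MHz
rows are the 5000/240:

```c
#include <stdio.h>

/* Ratio of a measured timing to a reference timing.
 * A result above 1.0 means the measured system took longer. */
static double slowdown(double measured_us, double reference_us)
{
	return measured_us / reference_us;
}

/* Print one comparison line for a single benchmark column. */
static void report(const char *label, double measured_us, double reference_us)
{
	printf("%-24s %9.0f / %9.0f = %5.1fx\n",
	    label, measured_us, reference_us,
	    slowdown(measured_us, reference_us));
}

/* Figures copied from the tables above (NetBSD 1.4W vs. ULTRIX 4.5). */
static void report_all(void)
{
	report("39 MHz fork proc", 64000, 3500);
	report("39 MHz exec proc", 182000, 11000);
	report("39 MHz sh proc", 358000, 24000);
	report("117 MHz fork proc", 5800, 5600);
	report("117 MHz exec proc", 43000, 14000);
	report("117 MHz sh proc", 67000, 30000);
	/* Point 4: 1.4W vs. the 1.4.1 figure (assumes the 39 MHz row). */
	report("mmap latency (5000/240)", 231177, 49203);
}
```

Calling report_all() from a main() prints one line per comparison; a
ratio near 1.0 means the two systems are roughly on par for that test.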


The benchmarks themselves are in lmbench/src/lat_proc.c.

The "fork proc" test basically is:

	void do_fork(void)
	{
		int     pid;

		switch (pid = fork()) {
		    case -1:
			perror("fork");
			exit(1);

		    case 0:     /* child */
			exit(1);

		    default:
			while (wait(0) != pid)
				;
		}
	}
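
For anyone who wants to reproduce a number like "fork proc" without
the whole lmbench tree, something along these lines should give a
ballpark figure.  This is only a sketch: lmbench's real timing harness
is more careful than this, usecs_per_call is a made-up helper name, and
the child calls _exit() where the excerpt above uses exit():

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* One fork/exit/wait cycle, modelled on the excerpt above. */
static void do_fork(void)
{
	pid_t pid;

	switch (pid = fork()) {
	case -1:
		perror("fork");
		exit(1);

	case 0:		/* child */
		/* _exit() so the child never flushes the parent's stdio
		 * buffers; the excerpt above uses exit() here. */
		_exit(1);

	default:	/* parent: reap the child we just created */
		while (wait(NULL) != pid)
			;
	}
}

/* Hypothetical helper: average wall-clock microseconds per call of
 * fn over n iterations, measured with gettimeofday(). */
static double usecs_per_call(void (*fn)(void), int n)
{
	struct timeval start, end;
	int i;

	gettimeofday(&start, NULL);
	for (i = 0; i < n; i++)
		fn();
	gettimeofday(&end, NULL);

	return ((end.tv_sec - start.tv_sec) * 1e6 +
	    (end.tv_usec - start.tv_usec)) / n;
}
```

A call such as usecs_per_call(do_fork, 1000) returns the mean cost of
one cycle in microseconds.  Wall-clock time on a busy machine will be
noisy, so take the best of several runs.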

The "exec proc" test basically is:

	void do_forkexec(void)
	{
		int     pid;
		char    *nav[2];

		nav[0] = PROG;
		nav[1] = 0;
		switch (pid = fork()) {
		    case -1:
			perror("fork");
			exit(1);

		    case 0:     /* child */
			close(1);
			execve(PROG, nav, 0);
			exit(1);

		    default:
			while (wait(0) != pid)
				;
		}
	}

The "sh proc" test basically is:

	void do_shell(void)
	{
		int     pid;

		switch (pid = fork()) {
		    case -1:
			perror("fork");
			exit(1);

		    case 0:     /* child */
			close(1);
			execlp("/bin/sh", "sh", "-c", PROG, (char *)0);
			exit(1);

		    default:
			while (wait(0) != pid)
				;
		}
	}

"PROG" is:

	int
	main()
	{
		write(1, "Hello world\n", 12);
		return (0);
	}

Simon.