Subject: Re: Performance
To: None <M.Hitter@trier.fh-rpl.de>
From: Scott Reynolds <scottr@edsi.org>
List: port-mac68k
Date: 06/03/1996 00:17:34
On Sun, 2 Jun 1996 M.Hitter@trier.fh-rpl.de wrote:

> --- copyvalue.result -----------------------------------
> Start simple copy:
> long: 13 seconds
> int: 27 seconds
> short: 40 seconds
> char: 53 seconds
> Start subroutine calls:
> long: 62 seconds
> int: 67 seconds
> short: 72 seconds
> char: 78 seconds
> --------------------------------------------------------

The solution is simple, I just realized (*smacks forehead*).  The times
are cumulative: each figure includes the time of every test before it.
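
To make that concrete, here is a minimal sketch (not part of the original
test program) that turns cumulative readings into per-test times by
subtracting each reading from the one after it.  The helper name
per_test_times is made up for illustration, and it assumes the first
reading was counted from zero.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Convert cumulative readings into per-test durations.
 * out[0] is the first reading as-is (assumes the clock started at zero);
 * out[i] is cumulative[i] - cumulative[i - 1] for i > 0.
 * per_test_times is a hypothetical helper, not from the original program.
 */
void
per_test_times(const int *cumulative, size_t n, int *out)
{
	size_t  i;

	assert(cumulative != NULL && out != NULL);
	for (i = 0; i < n; i++)
		out[i] = (i == 0) ? cumulative[0]
		    : cumulative[i] - cumulative[i - 1];
}
```

Fed the eight quoted readings (13, 27, 40, 53, 62, 67, 72, 78), this gives
13, 14, 13, 13 seconds for the copies and 9, 5, 5, 6 seconds for the
calls.  With whole-second readings, those differences are too coarse to
rank the types.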

None of these measurements is precise enough to say anything for
certain, since time() only resolves whole seconds.  Longer loops, timed
with gettimeofday() instead of time(), would be more useful.  Here's a
representative sample of my results; the program I used to generate them
is appended.

Start simple copy:
 long: 4.181 seconds
  int: 4.169 seconds
short: 4.162 seconds
 char: 4.279 seconds
Start subroutine calls:
 long: 8.733 seconds
  int: 8.752 seconds
short: 11.752 seconds
 char: 17.347 seconds

These results are in line with what I expect, more or less.  A certain
amount of noise in the data can be attributed to the load on my system
(16MHz 030), which was consistently light during these runs.

With an int loop control variable, the copies were about the same.  That
the short copies were slightly faster than the others doesn't seem to hold
much significance; one of my runs produced the long copy as fastest and
the short copy as slowest, by margins similar to those shown above.

However, all subroutine call loops had the same pattern -- long and int
were virtually identical (which is to be expected), while the implied
casting of the short and char arguments made those calls slower: in the
sample above, short took about a third longer than long, and char took
about twice as long.
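
For anyone who wants to check the arithmetic, here is a small sketch that
computes each timing relative to a baseline and the average cost per loop
iteration.  The helper names relative_cost and usec_per_iteration are
invented for this note; the inputs are the subroutine-call figures from
the sample above and the 2000000 iterations used by the appended program.

```c
#include <assert.h>

/* Ratio of one timing to a baseline timing (hypothetical helper). */
double
relative_cost(double seconds, double baseline_seconds)
{
	assert(baseline_seconds > 0.0);
	return seconds / baseline_seconds;
}

/* Average time per loop iteration, in microseconds (hypothetical helper). */
double
usec_per_iteration(double seconds, long iterations)
{
	assert(iterations > 0);
	return seconds * 1000000.0 / (double) iterations;
}
```

With long as the baseline (8.733 seconds), relative_cost() gives roughly
1.35 for short (11.752 seconds) and roughly 1.99 for char (17.347
seconds).  usec_per_iteration(8.733, 2000000) comes to a little under 4.4
microseconds.  Every figure includes the loop overhead, so these ratios
understate the difference in the calls themselves.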

The moral of the story:  avoid short (and char) types for function
parameters, and in arithmetic with wider types.  The implied casts will
more than kill any gain you get from the slightly more efficient machine
code.
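
The widening in arithmetic can be seen without timing anything.  In C,
short and char operands are promoted to int before an arithmetic
operator is applied, so sizeof on such an expression reports sizeof(int).
The sketch below only illustrates that rule; it measures nothing, and how
a narrow argument is widened for a call depends on the compiler and the
calling convention.  All function names here are made up for
illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Size of short + short: the operands are promoted, so the sum is an int. */
size_t
short_sum_size(void)
{
	short   a = 1, b = 2;

	return sizeof(a + b);
}

/* Size of char + char: again promoted to int before the addition. */
size_t
char_sum_size(void)
{
	char    a = 1, b = 2;

	return sizeof(a + b);
}

/*
 * One way to follow the advice: take an int parameter and narrow it
 * once, at the point where the value is stored.
 */
void
store_sample(short *slot, int value)
{
	assert(slot != NULL);
	*slot = (short) value;
}
```

Both size functions return sizeof(int) rather than sizeof(short) or 1.
store_sample() keeps the narrow type for storage only, which is one way
to keep the space saving without narrow parameters.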

--scott

/* compiled with 'cc -O -o copyvalue copyvalue.c'
   (like the default for the kernel). */

#include <stdio.h>
#include <stdlib.h>	/* exit() */
#include <sys/time.h>

double  diff_timeval(struct timeval *, struct timeval *);
void    calllong(long, long);
void    callint(int, int);
void    callshort(short, short);
void    callchar(char, char);

int
main(void)
{
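	long    i;
	struct timeval start, end;
	/* Initialize everything so no loop reads an indeterminate value. */
	long    lfrom = 1, lto = 0;
	int     ifrom = 1, ito = 0;
	short   sfrom = 1, sto = 0;
	char    cfrom = 1, cto = 0;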

	printf("Start simple copy: \n");

	gettimeofday(&start, NULL);
	for (i = 0; i < 2000000; i++)
		lto = lfrom;
	gettimeofday(&end, NULL);
	printf(" long: %.3f seconds\n", diff_timeval(&end, &start));

	gettimeofday(&start, NULL);
	for (i = 0; i < 2000000; i++)
		ito = ifrom;
	gettimeofday(&end, NULL);
	printf("  int: %.3f seconds\n", diff_timeval(&end, &start));

	gettimeofday(&start, NULL);
	for (i = 0; i < 2000000; i++)
		sto = sfrom;
	gettimeofday(&end, NULL);
	printf("short: %.3f seconds\n", diff_timeval(&end, &start));

	gettimeofday(&start, NULL);
	for (i = 0; i < 2000000; i++)
		cto = cfrom;
	gettimeofday(&end, NULL);
	printf(" char: %.3f seconds\n", diff_timeval(&end, &start));

	printf("Start subroutine calls: \n");

	gettimeofday(&start, NULL);
	for (i = 0; i < 2000000; i++)
		calllong(lfrom, lto);
	gettimeofday(&end, NULL);
	printf(" long: %.3f seconds\n", diff_timeval(&end, &start));

	gettimeofday(&start, NULL);
	for (i = 0; i < 2000000; i++)
		callint(ifrom, ito);
	gettimeofday(&end, NULL);
	printf("  int: %.3f seconds\n", diff_timeval(&end, &start));

	gettimeofday(&start, NULL);
	for (i = 0; i < 2000000; i++)
		callshort(sfrom, sto);
	gettimeofday(&end, NULL);
	printf("short: %.3f seconds\n", diff_timeval(&end, &start));

	gettimeofday(&start, NULL);
	for (i = 0; i < 2000000; i++)
		callchar(cfrom, cto);
	gettimeofday(&end, NULL);
	printf(" char: %.3f seconds\n", diff_timeval(&end, &start));

	exit(0);
}

/* Return ep - sp in seconds; may adjust *ep when borrowing a second. */
double
diff_timeval(struct timeval * ep, struct timeval * sp)
{
	if (sp->tv_usec > ep->tv_usec) {
		ep->tv_usec += 1000000;
		--ep->tv_sec;
	}
	return (double) (ep->tv_sec - sp->tv_sec) +
	    ((double) (ep->tv_usec - sp->tv_usec) / 1000000.0);
}

/* Empty bodies: the loops above measure only the cost of the call. */
void
calllong(long a, long b)
{
}

void
callint(int a, int b)
{
}

void
callshort(short a, short b)
{
}

void
callchar(char a, char b)
{
}