Subject: Re: openssl (or gcc) performance changes?
To: NetBSD current list <firstname.lastname@example.org>
From: William Allen Simpson <email@example.com>
Date: 10/02/2003 12:37:34
William Allen Simpson wrote:
> "Perry E. Metzger" wrote:
> > I would suggest that Bill might want to poke around a bit and figure
> > out the exact source of the slowdown.
> The programs involved were posted to PR 21983, so anybody can test
> and/or profile them.
> I only have 1 -current system, but I have another box with nearly
> identical configuration. When I have a bit of time, I'll try loading
> an old releng from around the July timeframe on it, and see whether
> anything is obvious.
OK, here are some of the obvious things:
1) GCC 3.3.1 appears to be faster than 2.95.3 in this application.
Running just qsieve on the same size BN is consistently about
0.1% faster (about 6 seconds in 5665, where runs of 2.95.3 are
always within 1 second of each other).
Running just qsafe on the exact same BN is consistently about
1.9% faster (about 1700 seconds in 92000).
2) Running either 3.3.1 or 2.95.3 with 1.6ZC (openssl 0.9.7b) is
always much slower than with 1.6U (openssl 0.9.6b for crypto/bn_prime.c,
0.9.6e for crypto/bn_mont.c, which seem to account for most of the
activity) when running qsafe (a primality tester using BN_is_prime).
By "much", I mean 30% to 300%, depending on the size of the prime
moduli being tested.
3) Yet, I have not found serious differences in this code. There are
a lot of tiny changes, but they all appear (to me) to be minor
cleanup. My eyes glazed over.
4) However, just this morning, I noticed that the load average is not
what I'd expected.
This run started at "Tue Sep 30 02:53:46"
date && ps u && uptime
Thu Oct 2 12:37:50 EDT 2003
USER     PID %CPU %MEM VSZ  RSS TT STAT STARTED       TIME COMMAND
current 4848 99.0  1.9 132 1268 p1 RN   Tue02AM 3457:08.20 ./qsafe 64
12:37PM up 5 days, 3:16, 3 users, load averages: 1.07, 1.08, 1.08
The 3457 minutes (57.62 hours) seems to be fairly close to what I'd
expect by the clock (57.7 hours).
Does this really mean that although the %CPU is at 99, the load
average is a minuscule 1%?
Has something in scheduling changed from 1.6U to 1.6ZC?
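As a sanity check on the arithmetic in items 1 and 4, here is a minimal Python sketch. The numbers are the ones quoted above; the helper names (relative_gain, ps_time_to_seconds) are made up for illustration, and it assumes the start time and the date output are in the same time zone.

```python
from datetime import datetime


def relative_gain(saved_seconds, total_seconds):
    """Return the time saved as a percentage of the total run time."""
    return 100.0 * saved_seconds / total_seconds


def ps_time_to_seconds(ps_time):
    """Convert a ps TIME field of the form 'MMMM:SS.ss' to seconds."""
    minutes, seconds = ps_time.split(":")
    return int(minutes) * 60 + float(seconds)


# Item 1: approximate figures quoted in the message.
qsieve_gain = relative_gain(6, 5665)
qsafe_gain = relative_gain(1700, 92000)

# Item 4: run start time and the `date` output quoted in the message.
start = datetime(2003, 9, 30, 2, 53, 46)
now = datetime(2003, 10, 2, 12, 37, 50)
wall_seconds = (now - start).total_seconds()
cpu_seconds = ps_time_to_seconds("3457:08.20")

print(f"qsieve gain: {qsieve_gain:.2f}%")
print(f"qsafe gain:  {qsafe_gain:.2f}%")
print(f"wall clock:  {wall_seconds / 3600:.2f} hours")
print(f"CPU time:    {cpu_seconds / 3600:.2f} hours")
print(f"CPU share:   {100 * cpu_seconds / wall_seconds:.1f}%")
```

With these inputs the CPU time and the wall-clock time differ by only a few minutes over the whole run, which agrees with the 99.0 in the %CPU column.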
I really could use some ideas! As you can see, the tests take days,
so a pointer to a better technique would be helpful!
William Allen Simpson
Key fingerprint = 17 40 5E 67 15 6F 31 26 DD 0D B9 9B 6A 15 2C 32