Subject: Re: HyperThread Technology
To: Steven M. Bellovin <smb@research.att.com>
From: MLH <mlh@goathill.org>
List: port-i386
Date: 07/08/2004 14:43:29
>
> In message <20040708052949.GC22720@kyyhky.embedtronics.fi>, Jukka Marin writes:
> >On Wed, Jul 07, 2004 at 09:16:34PM -0500, MLH wrote:
> >> Just missed it on current-users :
> >>
> >> http://mail-index.netbsd.org/current-users/2004/06/26/0012.html
> >
> >Did you do a full build with the MP kernel already? How long did it
> >take? How long did it take with the GENERIC kernel?
> >
> >Trying to find out how useful HT is in real work.. :)
> >
> It saves me a little -- 10-15%, I think -- on 'make sets'.
In my case I see very little gain and sometimes a loss. For example,
performing the following (after cleaning each time):
./build.sh -u -U -M /opt/obj -R /opt/snapshot tools kernel=ACPI.MP && \
./build.sh -u -U -M /opt/obj -R /opt/snapshot tools releasekernel=ACPI.MP
with sources over NFS and obj on a local 10krpm ST3120026AS SATA drive:
NetBSD bam.sfbr.org 2.0_BETA NetBSD 2.0_BETA (ACPI.MP) :
47.508u 20.102s 1:32.84 72.8% 0+0k 448+664io 2434pf+0w
NetBSD bam.sfbr.org 2.0_BETA NetBSD 2.0_BETA (GENERIC.MP) :
45.349u 17.776s 1:35.00 66.4% 0+0k 450+675io 2432pf+0w
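
To put the two time lines side by side, here is a minimal sketch that parses them and works out the wall-clock difference. It assumes the default csh `time` layout (user, system, elapsed, CPU percent); the helper names `to_seconds` and `parse_csh_time` are made up for illustration and are not part of any tool.

```python
import re

def to_seconds(elapsed: str) -> float:
    """Convert a csh elapsed field such as "1:32.84" to seconds."""
    total = 0.0
    for part in elapsed.split(":"):
        total = total * 60 + float(part)
    return total

def parse_csh_time(line: str) -> dict:
    """Parse the first four fields of a csh `time` line (assumed layout)."""
    m = re.match(r"\s*([\d.]+)u\s+([\d.]+)s\s+([\d:.]+)\s+([\d.]+)%", line)
    if m is None:
        raise ValueError(f"unrecognised time line: {line!r}")
    user, system = float(m.group(1)), float(m.group(2))
    elapsed = to_seconds(m.group(3))
    return {
        "user": user,
        "sys": system,
        "elapsed": elapsed,
        # Recomputed from user + sys, as a cross-check on the printed percent.
        "cpu_pct": 100.0 * (user + system) / elapsed,
    }

# The two lines reported above, copied verbatim.
acpi = parse_csh_time("47.508u 20.102s 1:32.84 72.8% 0+0k 448+664io 2434pf+0w")
generic = parse_csh_time("45.349u 17.776s 1:35.00 66.4% 0+0k 450+675io 2432pf+0w")

saving_pct = 100.0 * (generic["elapsed"] - acpi["elapsed"]) / generic["elapsed"]
print(f"ACPI.MP:    {acpi['elapsed']:.2f}s elapsed, {acpi['cpu_pct']:.1f}% CPU")
print(f"GENERIC.MP: {generic['elapsed']:.2f}s elapsed, {generic['cpu_pct']:.1f}% CPU")
print(f"ACPI.MP finished {saving_pct:.1f}% sooner")
```

On these two runs that comes to about 2.3% less wall-clock time for ACPI.MP, and the recomputed CPU figures agree with the 72.8% and 66.4% that time printed. These are single runs, so a difference that small may not be significant.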
Some benchmarks:
------------------------------------
NetBSD bam.sfbr.org 2.0_BETA NetBSD 2.0_BETA (ACPI.MP)
pystones:
This machine benchmarks at 34013.6 pystones/second
flops:
FLOPS C Program (Double Precision), V2.0 18 Dec 1992
Module         Error     RunTime       MFLOPS
                          (usec)
  1       4.0146e-13      0.0109    1282.1781
  2      -1.4166e-13      0.0113     616.9612
  3       4.7184e-14      0.0194     878.1818
  4      -1.2557e-13      0.0137    1093.6184
  5      -1.3800e-13      0.0264    1097.7967
  6       3.2380e-13      0.0255    1136.6024
  7      -8.4583e-11      0.0332     361.6202
  8       3.4867e-13      0.0255    1175.9014
Iterations = 512000000
NullTime (usec) = 0.0004
MFLOPS(1) = 683.4207
MFLOPS(2) = 664.7419
MFLOPS(3) = 944.2470
MFLOPS(4) = 1082.0311
linpacks:
Rolled Single Precision Linpack
norm. resid resid machep x[0]-1 x[n-1]-1
1.9 4.52336171e-05 1.19209290e-07 -1.31130219e-05 -1.30534172e-05
times are reported for matrices of order 100
dgefa dgesl total kflops unit ratio
times for array with leading dimension of 201
0.00 0.00 0.00 1046748 0.00 0.01
0.00 0.00 0.00 1049949 0.00 0.01
0.00 0.00 0.00 1043567 0.00 0.01
0.00 0.00 0.00 1044996 0.00 0.01
times for array with leading dimension of 200
0.00 0.00 0.00 1037261 0.00 0.01
0.00 0.00 0.00 1037260 0.00 0.01
0.00 0.00 0.00 1032580 0.00 0.01
0.00 0.00 0.00 1048186 0.00 0.01
Rolled Single Precision 1044996 Kflops ; 10 Reps
linpackd:
Rolled Double Precision Linpack
norm. resid resid machep x[0]-1 x[n-1]-1
1.7 7.41628980e-14 2.22044605e-16 -1.49880108e-14 -1.89848137e-14
times are reported for matrices of order 100
dgefa dgesl total kflops unit ratio
times for array with leading dimension of 201
0.00 0.00 0.00 910698 0.00 0.01
0.00 0.00 0.00 915556 0.00 0.01
0.00 0.00 0.00 915556 0.00 0.01
0.00 0.00 0.00 915922 0.00 0.01
times for array with leading dimension of 200
0.00 0.00 0.00 897603 0.00 0.01
0.00 0.00 0.00 856193 0.00 0.01
0.00 0.00 0.00 842536 0.00 0.01
0.00 0.00 0.00 843157 0.00 0.01
Rolled Double Precision 843157 Kflops ; 10 Reps
dry2: 10000000 runs
Microseconds for one run through Dhrystone: 0.2
Dhrystones per Second: 4926108.5
------------------------------------
NetBSD bam.sfbr.org 2.0_BETA NetBSD 2.0_BETA (GENERIC.MP) :
pystones:
This machine benchmarks at 33974.3 pystones/second
flops:
FLOPS C Program (Double Precision), V2.0 18 Dec 1992
Module         Error     RunTime       MFLOPS
                          (usec)
  1       4.0146e-13      0.0263     531.5738
  2      -1.4166e-13      0.0242     288.6942
  3       4.7184e-14      0.0257     662.3287
  4      -1.2557e-13      0.0258     580.8589
  5      -1.3800e-13      0.0277    1047.8477
  6       3.2380e-13      0.0269    1079.3151
  7      -8.4583e-11      0.0496     242.0390
  8       3.4867e-13      0.0266    1129.9241
Iterations = 512000000
NullTime (usec) = 0.0004
MFLOPS(1) = 353.9761
MFLOPS(2) = 459.6483
MFLOPS(3) = 700.2343
MFLOPS(4) = 867.4086
linpacks:
Rolled Single Precision Linpack
norm. resid resid machep x[0]-1 x[n-1]-1
1.9 4.52336171e-05 1.19209290e-07 -1.31130219e-05 -1.30534172e-05
times are reported for matrices of order 100
dgefa dgesl total kflops unit ratio
times for array with leading dimension of 201
0.00 0.00 0.00 1049949 0.00 0.01
0.00 0.00 0.00 1049949 0.00 0.01
0.00 0.00 0.00 1037261 0.00 0.01
0.00 0.00 0.00 1058038 0.00 0.01
times for array with leading dimension of 200
0.00 0.00 0.00 959033 0.00 0.01
0.00 0.00 0.00 957694 0.00 0.01
0.00 0.00 0.00 953703 0.00 0.01
0.00 0.00 0.00 963202 0.00 0.01
Rolled Single Precision 963202 Kflops ; 10 Reps
linpackd:
Rolled Double Precision Linpack
norm. resid resid machep x[0]-1 x[n-1]-1
1.7 7.41628980e-14 2.22044605e-16 -1.49880108e-14 -1.89848137e-14
times are reported for matrices of order 100
dgefa dgesl total kflops unit ratio
times for array with leading dimension of 201
0.00 0.00 0.00 922939 0.00 0.01
0.00 0.00 0.00 927928 0.00 0.01
0.00 0.00 0.00 927928 0.00 0.01
0.00 0.00 0.00 925801 0.00 0.01
times for array with leading dimension of 200
0.00 0.00 0.00 884880 0.00 0.01
0.00 0.00 0.00 894097 0.00 0.01
0.00 0.00 0.00 892935 0.00 0.01
0.00 0.00 0.00 892122 0.00 0.01
Rolled Double Precision 892122 Kflops ; 10 Reps
dry2: 10000000 runs
Microseconds for one run through Dhrystone: 0.2
Dhrystones per Second: 4950495.0
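
For a quick side-by-side of the two kernels, the headline figures can be turned into ratios. This is only a sketch: the table below copies numbers from the output above, and the names `results` and `ratio` are illustrative.

```python
# Headline results copied from the two runs above, as (ACPI.MP, GENERIC.MP).
results = {
    "pystones/s": (34013.6, 33974.3),
    "flops MFLOPS(1)": (683.4207, 353.9761),
    "flops MFLOPS(2)": (664.7419, 459.6483),
    "flops MFLOPS(3)": (944.2470, 700.2343),
    "flops MFLOPS(4)": (1082.0311, 867.4086),
    "linpack single Kflops": (1044996, 963202),
    "linpack double Kflops": (843157, 892122),
    "dhrystones/s": (4926108.5, 4950495.0),
}

def ratio(pair):
    """Return ACPI.MP / GENERIC.MP; above 1.0 means ACPI.MP scored higher."""
    acpi, generic = pair
    return acpi / generic

for name, pair in results.items():
    print(f"{name:24s} ACPI.MP/GENERIC.MP = {ratio(pair):.2f}")
```

A ratio above 1.00 means ACPI.MP scored higher. Each figure comes from a single run, so the near-1.00 ratios (pystones, Dhrystone) are probably within run-to-run noise; only the flops composites differ by a wide margin.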