Subject: Re: scheduler woes on MPACPI kernel
To: Manuel Bouyer <email@example.com>
From: Peter O'Kane <firstname.lastname@example.org>
Date: 01/19/2005 17:15:34
Good point. Further results here running make -j8 to build a 2.0 kernel:
Real    User    System  Configuration
288.12  255.15  35.97   Single processor no HT
251.72  440.61  49.55   Single processor with HT
165.58  260.00  40.74   Twin physical processors no HT
151.54  435.04  69.11   Twin physical processors with HT
Again, running make -j16:
160.9   263.44  44.24   Twin physical processors no HT
146.46  452.66  78.69   Twin physical processors with HT
And finally, make -j32:
158.82  260.33  42.63   Twin physical processors no HT
146.23  465.37  77.8    Twin physical processors with HT
Note that the user/system time required to do the job remains fairly
constant at about 250-260/30-45 seconds with HT off. With HT enabled the
user/system time required rises to about 440-460/50-70 seconds.
It appears that HT gives about a 10% overall performance boost for many tasks
but the effective processor speed seen by a single thread is 55%-60% of the
speed of the processor in non HT mode.
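For anyone who wants to check those two figures, here is a rough sketch in
Python that recomputes them from the make -j8 rows. The function names are
made up for illustration, and it assumes that the ratio of user CPU time
without HT to user CPU time with HT is a fair stand-in for the speed seen by
a single thread.

```python
# (real, user, system) times in seconds for make -j8, copied from the table.
runs = {
    "single, no HT": (288.12, 255.15, 35.97),
    "single, HT": (251.72, 440.61, 49.55),
    "twin, no HT": (165.58, 260.00, 40.74),
    "twin, HT": (151.54, 435.04, 69.11),
}


def wallclock_gain(no_ht, ht):
    """Fractional reduction in real (elapsed) time when HT is enabled."""
    return (no_ht[0] - ht[0]) / no_ht[0]


def per_thread_speed(no_ht, ht):
    """User CPU time without HT divided by user CPU time with HT.

    Assumption: the same work is done in both runs, so a larger user time
    with HT means each logical CPU is running proportionally slower.
    """
    return no_ht[1] / ht[1]


for label in ("single", "twin"):
    off = runs[label + ", no HT"]
    on = runs[label + ", HT"]
    print(
        "%s: elapsed time gain %.1f%%, per-thread speed %.0f%%"
        % (label, 100 * wallclock_gain(off, on), 100 * per_thread_speed(off, on))
    )
```

The elapsed time gain comes out at somewhat over 10% for the single
processor and somewhat under 10% for the twin processors, and the per-thread
ratio lands inside the 55%-60% range in both cases.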
--On 19 January 2005 12:06 +0100 Manuel Bouyer <email@example.com>
> On Wed, Jan 19, 2005 at 10:59:08AM +0000, Peter O'Kane wrote:
>> For what it's worth here are some kernel compile times on a dual Xeon
>> 2.4GHz system in different configurations. (NetBSD 2.0):
>> Single physical processor HT disabled:
>> Real: 370.40s User: 242.6s System: 27.57s (make)
>> Single physical processor HT enabled:
>> Real: 306.13s User: 374.68s System: 41.34s (make -j2)
>> Two physical processors HT disabled:
>> Real: 236.42s User: 248.66s System: 33.8s (make -j2)
>> Two physical processors HT enabled:
>> Real: 173.6s User: 377.9s System: 53.51s (make -j4)
>> Real: 151.54s User: 435.04s System: 69.11s (make -j8)
>> Real: 146.47s User: 452.66s System: 78.69s (make -j16)
>> This system has all its file systems on a 500G RAID 5 array which makes
>> it quite i/o bound. There is 3G of RAM so there is no paging traffic.
> It would have made more sense to use the same -j flag (e.g. -j8) for all
> your tests. Even for a non HT single processor, -j8 will be faster than
> -j1, because it will spend less time waiting on I/O.
> Also, I suspect that the Xeon's HT isn't exactly the same as P4's HT.
> The virtual processors may share more things in the P4 than in the xeon.
> Manuel Bouyer <firstname.lastname@example.org>
> NetBSD: 26 years of experience will always make the difference
Peter O'Kane E-mail:email@example.com
Information Technology Department, Voice: +353 91 492527
National University of Ireland, Galway. Fax: +353 91 494501