tech-kern archive


Re: Making run queues independent of the pluggable scheduler



-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Andrew Doran wrote:
| Hi,
|
| The diff below extracts the per-CPU run queue code from the M2 scheduler
| and makes it non-optional, removing the 4BSD scheduler's global run
| queue. With the patch, the pluggable scheduler is responsible only for
| adjusting the priority of timeshared jobs.
|
| Reasons for doing this:
|
| - 4BSD gains processor sets/affinity, although I haven't tested that yet.
| - 4BSD gets a huge performance boost on producer/consumer workloads like
|   sysbench OLTP.
| - We have less code to maintain.
|
| There are a couple of other changes:
|
| - It makes sched_enqueue responsible for causing a preemption if needed.
|   Previously this was left up to the caller and was only done at one site
|   (sleepq_remove).
|
| - It changes the CPU selection algorithm slightly. Weak affinity is not
|   considered until the job has context switched a preset number of times,
|   currently 5. This is to try and better distribute jobs among the CPUs.
|   It uses the new call idle_pick to find an idle CPU if possible. If there
|   are no idle CPUs, it does a circular scan of CPUs instead of always
|   starting at the first CPU. That's to try and ensure that we don't
|   unfairly overload one CPU. I will make the CPU selection changes a
|   separate commit if they have been demonstrated to be worthwhile.
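
The two changes quoted above (preemption handled inside sched_enqueue, and
the new CPU selection order) can be pictured with a small user-space sketch
in C. It is not taken from the diff. Every name ending in _sketch, the
structures, NCPU and scan_start are hypothetical. It also assumes that a
larger priority number means a more urgent job, that "context switched 5
times" means a count of at least 5, that weak affinity simply means
"return to the last CPU", and that the circular scan picks the shortest
run queue. The mail does not say what the scan looks for, so that
criterion in particular is a guess.

```c
#include <assert.h>
#include <stdbool.h>

#define NCPU              4  /* assumed CPU count for this sketch */
#define AFFINITY_SWITCHES 5  /* the "preset number" of context switches */

struct cpu_sketch {
	bool idle;     /* nothing running or queued */
	int  nqueued;  /* jobs on this CPU's run queue */
	int  curpri;   /* priority of the running job; larger = more urgent (assumed) */
	bool resched;  /* set when a preemption has been requested */
};

struct job_sketch {
	int      last_cpu;   /* CPU the job last ran on, or -1 */
	unsigned nswitches;  /* context switches so far */
	int      pri;        /* job priority; larger = more urgent (assumed) */
};

static struct cpu_sketch cpus[NCPU];
static int scan_start;  /* where the next circular scan begins */

/* Hypothetical stand-in for idle_pick: index of an idle CPU, or -1. */
static int
idle_pick_sketch(void)
{
	for (int i = 0; i < NCPU; i++)
		if (cpus[i].idle)
			return i;
	return -1;
}

static int
select_cpu_sketch(const struct job_sketch *j)
{
	int ci, best;

	/* Weak affinity only counts once the job has switched often enough. */
	if (j->nswitches >= AFFINITY_SWITCHES && j->last_cpu >= 0)
		return j->last_cpu;

	/* Otherwise prefer an idle CPU if there is one. */
	if ((ci = idle_pick_sketch()) >= 0)
		return ci;

	/*
	 * No idle CPU: circular scan from a rotating start point.
	 * The shortest queue wins (assumed criterion); on ties the CPU
	 * nearest the start point is kept, so ties rotate across CPUs.
	 */
	best = scan_start;
	for (int n = 1; n < NCPU; n++) {
		ci = (scan_start + n) % NCPU;
		if (cpus[ci].nqueued < cpus[best].nqueued)
			best = ci;
	}
	scan_start = (scan_start + 1) % NCPU;
	return best;
}

/*
 * Enqueue on the chosen CPU and request a preemption there if the CPU
 * was idle or the new job outranks the running one. Returns the CPU index.
 */
static int
sched_enqueue_sketch(struct job_sketch *j)
{
	int ci = select_cpu_sketch(j);

	assert(ci >= 0 && ci < NCPU);
	cpus[ci].nqueued++;
	if (cpus[ci].idle || j->pri > cpus[ci].curpri)
		cpus[ci].resched = true;
	cpus[ci].idle = false;
	j->last_cpu = ci;
	return ci;
}
```

With every CPU busy and all queues equally long, successive calls land on
CPU 0, then CPU 1, and so on, rather than always on the first CPU. The
sketch treats all CPUs as equidistant and has no notion of sockets or
memory nodes.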

That sounds like the new CPU selection algorithm runs most efficiently
on a single-socket multi-core machine. Can you elaborate on how it is
intended to scale on NUMA machines, please?

|
| ... and a couple of notes:
|
| - Some or all of the items in runqueue_t could be safely merged into
|   schedstate_percpu, but I think it's better to integrate things piecemeal
|   if possible.
|
| - Previously M2's per-CPU approach performed poorly on build.sh but with
|   yesterday's changes to rwlocks and turnstiles it matches the global run
|   queue used by 4BSD. This shows the number of seconds to complete
|   build.sh -j16 release on an 8-core machine:
|   http://www.netbsd.org/~ad/sched2.png
|
| http://www.netbsd.org/~ad/sched.diff
|
| Comments?
|
| Thanks,
| Andrew

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Darwin)

iQEcBAEBAgAGBQJH99fUAAoJEH5bnJXp/MoMYG8IAJWa75Ws6+Ta664Z8HfmkDKF
uH9O8ersd4wAQM9RuYUweAt2rqjCVLYdnTBthqYjLeopepAL7LKBZiytcXd7Xm11
KLd4ZaZsMJch6XeYZGi49goz6uOEzTeRYpo7ZTnJZjw7kODcKnfVUY7sjvA7pCtS
J/v+kRE+H1WlBF3lfcSslAnYyp07fIPmKCzqwF4JhHfpmIlfELU7Mv3EvrrGreOp
xepKzID0XGNAok2BYXVpjUyimbbagU9vAU+3rvZvh76oCzyLuZKCjEig/bUvRnTB
wu3a2XvfqZHj5mngC8B3c7Jc4b3PUzAFoa/09TT+WLD0yRfWged1O1pILDy9lZY=
=vk3V
-----END PGP SIGNATURE-----

