tech-kern archive
Making run queues independent of the pluggable scheduler
Hi,
The diff below extracts the per-CPU run queue code from the M2 scheduler and
makes it non-optional, removing the 4BSD scheduler's global run queue. With
the patch, the pluggable scheduler is responsible only for adjusting the
priority of timeshared jobs.
Reasons for doing this:
- 4BSD gains processor sets/affinity, although I haven't tested that yet.
- 4BSD gets a huge performance boost on producer/consumer workloads like
sysbench OLTP.
- We have less code to maintain.
There are a couple of other changes:
- It makes sched_enqueue responsible for causing a preemption if needed.
Previously this was left up to the caller and was only done at one site
(sleepq_remove).
- It changes the CPU selection algorithm slightly. Weak affinity is not
considered until the job has context switched a preset number of times,
currently 5. This is to try and better distribute jobs among the CPUs. It
uses the new call idle_pick to find an idle CPU if possible. If there are
no idle CPUs, it does a circular scan of the CPUs instead of always
starting at the first CPU. That's to try and ensure that we don't unfairly
overload one CPU. I will make the CPU selection changes a separate commit
if they are demonstrated to be worthwhile.
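
To illustrate the CPU selection order described above, here is a rough
user-space sketch. It is not the code from the diff. Every name in it
(pick_cpu, idle_pick_sketch, struct cpu, struct job, next_scan, qlen) is
made up for the example, and several details are assumptions: that the
affinity test comes before the idle search, that "a preset number of times"
means at least 5, and that the circular scan prefers the shortest queue.

```c
#include <stdbool.h>

#define NCPU                  4   /* hypothetical CPU count */
#define AFFINITY_MIN_SWITCHES 5   /* "currently 5" in the post */

struct cpu {
	bool     idle;   /* nothing runnable on this CPU */
	unsigned qlen;   /* jobs queued here; used as a load estimate (assumption) */
};

struct job {
	int      last_cpu;   /* CPU the job last ran on */
	unsigned nswitches;  /* context switches so far */
};

static struct cpu cpus[NCPU];
static int next_scan;        /* where the next circular scan starts */

/* Stand-in for idle_pick: index of some idle CPU, or -1 if none. */
static int
idle_pick_sketch(void)
{
	for (int i = 0; i < NCPU; i++)
		if (cpus[i].idle)
			return i;
	return -1;
}

static int
pick_cpu(const struct job *j)
{
	int ci, start, best;

	/* 1. Weak affinity, but only once the job has switched enough. */
	if (j->nswitches >= AFFINITY_MIN_SWITCHES &&
	    j->last_cpu >= 0 && j->last_cpu < NCPU)
		return j->last_cpu;

	/* 2. Otherwise prefer an idle CPU if there is one. */
	ci = idle_pick_sketch();
	if (ci >= 0)
		return ci;

	/*
	 * 3. No idle CPU: scan all CPUs starting at a rotating index, so
	 * that ties are not always resolved in favour of the first CPU.
	 */
	start = best = next_scan;
	for (int k = 1; k < NCPU; k++) {
		int i = (start + k) % NCPU;
		if (cpus[i].qlen < cpus[best].qlen)
			best = i;
	}
	next_scan = (start + 1) % NCPU;
	return best;
}
```

With all CPUs equally loaded and none idle, successive calls for a new job
return successive CPUs rather than piling everything onto the first one.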
... and a couple of notes:
- Some or all of the items in runqueue_t could be safely merged into
schedstate_percpu, but I think it's better to integrate things piecemeal
if possible.
- Previously M2's per-CPU approach performed poorly on build.sh, but with
yesterday's changes to rwlocks and turnstiles it matches the global run
queue used by 4BSD. This shows the number of seconds to complete build.sh
-j16 release on an 8-core machine: http://www.netbsd.org/~ad/sched2.png
http://www.netbsd.org/~ad/sched.diff
Comments?
Thanks,
Andrew