Subject: Scheduler project status and further discussion
To: None <tech-kern@netbsd.org>
From: Daniel Sieger <dsieger@TechFak.Uni-Bielefeld.DE>
List: tech-kern
Date: 01/14/2007 20:49:14

Hi folks,
here's a quick summary of the current status of my scheduler project
as well as some questions about where to go from here. The good news
is that I've accomplished the minimal goals for my university
project. That is:

1. There is a first scheduler API which allows different algorithms to
   be implemented.
2. I've implemented one other scheduler. It is only a dumb
   fixed-priority scheduler, but it uses more runqueues than our
   current scheduler.

Patches can be found at [1]. csf.diff is against -current and
includes the idle lwp stuff; scheduler.diff is against the idle lwp
patch for easier review. The above-mentioned test scheduler is not
included, since it is admittedly quite hackish and not relevant for
further progress (although I intend to port the increased number of
runqueues to sched_4bsd).

In short, the patch does the following:

- Separate functions and definitions specific to the 4.4BSD scheduler
  from those independent of the scheduler implementation.

- Define a first scheduler API in sched.h

- Adapt the 4.4BSD scheduler, as well as the rest of the kernel, to
  the interface.

- Add a kernel option to select which scheduler to use at
  compile-time.

- Make userland independent from scheduler implementation (e.g. top(1)
  and ps(1)).

The CHANGES file contained in the tarball has some more details.

However, while this is enough for my project, it's not for
NetBSD. :) There are a couple of issues and questions I'd like to
discuss:

1. What to do about the priority range? It is currently fixed to
   0..127. Different scheduler implementations may want to use a
   different priority range, but I'm not sure if allowing a variable
   priority range is really sane. All other systems have a fixed range
   (FreeBSD has 255, Solaris has 169, Linux has 139), IIRC.

   Personally, I'd vote to increase MAXPRI to at least 159, which
   should be more than enough for any scheduler.

   Furthermore, we should discuss how to split up the priority range
   into separate classes, e.g. something like (suggested by ad@):

     user
     kernel (user processes sleeping in the kernel, kernel threads)
     realtime (realtime threads, in user space or the kernel)
     interrupt (interrupt threads)

   How Solaris, FreeBSD and Linux handle this can be found in [2].

2. Do we want to have SystemV scheduling classes? While this approach
   has some very nice features and is quite flexible, it does not come
   without a cost. We'd need to introduce a whole bunch of additional
   abstractions, which could make things much more complicated and
   could possibly have a negative impact on performance ([2] also
   mentions this).

Finally, here's what I intend to do next:

- Port idle lwps to Andrew's newlock2 branch. This also resolves some
  other issues I have with the patch against -current (sleepq stuff).

- Port the scheduler API to newlock2.

- Add locking rules to scheduler API. We need to make sure certain API
  functions are only called with appropriate locks held etc. Any
  suggestions how to do this best are more than welcome.

- Eventually implement SystemV scheduling classes.

Regards,
Daniel

[1]: http://wwwhomes.uni-bielefeld.de/dsieger/csf-0.1.tar.bz2
[2]: http://www.opensolaris.org/os/article/2005-10-14_a_comparison_of_solaris__linux__and_freebsd_kernels/

--
Daniel Sieger
Faculty of Technology
Bielefeld University
wwwhomes.uni-bielefeld.de/dsieger
