Processor sets, affinity, real-time extensions


Here is an implementation of processor sets and CPU affinity calls. It also
implements POSIX real-time extensions such as pthread_getschedparam(),
pthread_setschedparam(), pthread_setschedprio(), sched_setscheduler() and
sched_getscheduler(). A userland utility, schedctl(1), is provided as well.
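
For readers less familiar with the POSIX side, here is a minimal sketch of how
an application might query its own scheduling policy and priority with the
standard calls. The helper name query_sched() is made up for illustration and
is not part of the patch; only pthread_getschedparam(),
sched_get_priority_min() and sched_get_priority_max() are standard.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sched.h>

/*
 * Hypothetical helper: fetch the calling thread's policy and priority,
 * plus the priority range the system reports for that policy.
 * Returns 0 on success or an errno-style code on failure.
 */
int
query_sched(int *policy, int *prio, int *prio_min, int *prio_max)
{
	struct sched_param sp;
	int error;

	assert(policy != NULL && prio != NULL);
	assert(prio_min != NULL && prio_max != NULL);

	/* pthread_*() calls return the error code instead of setting errno. */
	error = pthread_getschedparam(pthread_self(), policy, &sp);
	if (error != 0)
		return error;
	*prio = sp.sched_priority;

	/* These two return -1 and set errno on failure. */
	*prio_min = sched_get_priority_min(*policy);
	*prio_max = sched_get_priority_max(*policy);
	if (*prio_min == -1 || *prio_max == -1)
		return errno;

	return 0;
}
```

The policy returned here is what this mail calls the scheduling class.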

The processor-set calls are compatible with the Solaris API:

  int pset_assign(psetid_t, cpuid_t, psetid_t *);
  int pset_bind(psetid_t, idtype_t, id_t, psetid_t *);
  int pset_create(psetid_t *);
  int pset_destroy(psetid_t);

Two non-portable pthread affinity calls (compatible with Linux):

  int pthread_getaffinity_np(pthread_t, size_t, cpuset_t *);
  int pthread_setaffinity_np(pthread_t, size_t, cpuset_t *);

Unless anybody objects, I would like to start merging these sources.
A few things to mention about the implementation:

1. Instead of providing separate system calls for setting the priority and
scheduling class (policy, as defined by POSIX), there is an internal system
call _sched_setparam() to pass all parameters in a structure:

struct sched_param {
        int     sched_class;
        int     sched_priority;
};

Is there anything potentially wrong with this?
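
To make the question concrete, here is a rough userland sketch of how two
public entry points could funnel into one internal call that takes the whole
structure. Everything prefixed with demo_ or DEMO_ is hypothetical: the
"unchanged" marker, the class and priority limits, and the demo_thread type
are assumptions made for illustration and do not describe the actual patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define DEMO_UNCHANGED	(-1)	/* assumed marker: keep the current value */
#define DEMO_CLASS_MAX	2	/* assumed number of classes minus one */
#define DEMO_PRIO_MAX	63	/* assumed highest priority */

struct demo_sched_param {
	int	sched_class;
	int	sched_priority;
};

/* Stand-in for the per-thread scheduling state kept by the kernel. */
struct demo_thread {
	int	cls;
	int	prio;
};

/*
 * Stand-in for _sched_setparam(): validate both fields together and
 * apply them in one step, so a bad priority never leaves the thread
 * with a half-updated class.
 */
int
demo_sched_setparam(struct demo_thread *t, const struct demo_sched_param *sp)
{
	int cls, prio;

	assert(t != NULL && sp != NULL);
	cls = (sp->sched_class == DEMO_UNCHANGED) ? t->cls : sp->sched_class;
	prio = (sp->sched_priority == DEMO_UNCHANGED) ?
	    t->prio : sp->sched_priority;

	if (cls < 0 || cls > DEMO_CLASS_MAX || prio < 0 || prio > DEMO_PRIO_MAX)
		return EINVAL;

	t->cls = cls;
	t->prio = prio;
	return 0;
}

/* Wrapper in the style of sched_setscheduler(): sets class and priority. */
int
demo_setscheduler(struct demo_thread *t, int cls, int prio)
{
	struct demo_sched_param sp = { cls, prio };

	return demo_sched_setparam(t, &sp);
}

/* Wrapper in the style of pthread_setschedprio(): priority only. */
int
demo_setprio(struct demo_thread *t, int prio)
{
	struct demo_sched_param sp = { DEMO_UNCHANGED, prio };

	return demo_sched_setparam(t, &sp);
}
```

One consequence of this shape is that a priority-only call still has to say
something about the class field, which is why the sketch needs a marker value.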

2. There is another internal system call:

int _pset_bind(idtype_t idtype, id_t first_id, id_t second_id,
               psetid_t psid, psetid_t opsid);

It is reasonable to let the administrator bind any thread via a userland
utility. For that, two IDs are needed (e.g. one for the PID and another for
the LID). I do not like this design, i.e. passing two ID arguments to a
syscall, but I am not sure about a better way. Thoughts?

3. There is a kernel function, lwp_migrate(), which can be used for generic
migration of a thread from one CPU to another. The problematic case is when
the thread is in the LSONPROC state. In that case lwp::l_target_cpu is set,
and the migration is performed in mi_switch(). However, this increases the
complexity of mi_switch(). One alternative would be migration queues, but
that has a few disadvantages:
1) this is needed only for SCHED_M2 scheduler because of per-CPU locks;
   migration with SCHED_4BSD is trivial;
2) there is no easy way to abstract and close this in the scheduler.
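
The deferred approach described in point 3 can be sketched in plain C as
follows. This is a userland toy model, not the kernel code: the demo_ types
and functions are hypothetical stand-ins for struct lwp, lwp_migrate() and
mi_switch(), and all locking and run-queue handling is left out.

```c
#include <assert.h>
#include <stddef.h>

enum demo_stat { DEMO_LSRUN, DEMO_LSONPROC };

struct demo_cpu {
	int		id;
};

struct demo_lwp {
	enum demo_stat	 l_stat;	/* runnable or currently on a CPU */
	struct demo_cpu	*l_cpu;		/* CPU the thread belongs to */
	struct demo_cpu	*l_target_cpu;	/* pending target, NULL if none */
};

/*
 * Model of lwp_migrate(): move the thread at once if it is not
 * running; otherwise only record the target and let the switch
 * path finish the job.
 */
void
demo_lwp_migrate(struct demo_lwp *l, struct demo_cpu *target)
{
	assert(l != NULL && target != NULL);

	if (l->l_cpu == target) {
		l->l_target_cpu = NULL;	/* already there, cancel any request */
		return;
	}
	if (l->l_stat == DEMO_LSONPROC) {
		l->l_target_cpu = target;	/* defer until it leaves the CPU */
		return;
	}
	l->l_cpu = target;
	l->l_target_cpu = NULL;
}

/*
 * Model of the extra work in mi_switch(): once the thread is off the
 * CPU, complete a pending migration.
 */
void
demo_mi_switch(struct demo_lwp *l)
{
	assert(l != NULL);

	l->l_stat = DEMO_LSRUN;
	if (l->l_target_cpu != NULL) {
		l->l_cpu = l->l_target_cpu;
		l->l_target_cpu = NULL;
	}
}
```

The extra branch in demo_mi_switch() is the added complexity mentioned above;
it sits on a path that every context switch goes through.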

Best regards,

