tech-kern archive

Re: Processor sets, affinity, real-time extensions

On Thu, Jan 10, 2008 at 02:21:24AM +0200, Mindaugas R. wrote:

> here is the implementation of processor-sets and CPU affinity calls. Also,
> this implements POSIX real-time extensions like: pthread_getschedparam(),
> pthread_setschedparam(), pthread_setschedprio(), sched_setscheduler(),
> sched_getscheduler(), etc. A userland utility schedctl(1) is provided too.
> The processor-sets are compatible with Solaris API:
>   int pset_assign(psetid_t, cpuid_t, psetid_t *);
>   int pset_bind(psetid_t, idtype_t, id_t, psetid_t *);
>   int pset_create(psetid_t *);
>   int pset_destroy(psetid_t);
> Two non-portable pthread affinity calls (compatible with Linux):
>   int pthread_getaffinity_np(pthread_t, size_t, cpuset_t *);
>   int pthread_setaffinity_np(pthread_t, size_t, cpuset_t *);
> Unless anybody objects, I would like to start merging these sources.
> A few things to mention about the implementation:
> 1. Instead of providing separate system calls for setting the priority and
> scheduling class (policy, as defined by POSIX), there is an internal system
> call _sched_setparam() to pass all parameters in a structure:
> struct sched_param {
>       int     sched_class;
>       int     sched_priority;
> };
> Is there anything potentially wrong with this?
> 2. There is another internal system call:
> int _pset_bind(idtype_t idtype, id_t first_id, id_t second_id,
>              psetid_t psid, psetid_t opsid);
> It is reasonable to let the administrator bind arbitrary threads via a
> userland utility. For that, two IDs are needed (e.g. one for the PID and
> another for the LID). I do not like this design, i.e. using two ID arguments
> in a syscall, but I am not sure of a better way. Thoughts?
> 3. There is a kernel function lwp_migrate(), which might be used for generic
> migration of a thread from one CPU to another. The problematic case is when
> the thread is in the LSONPROC state. In that case lwp::l_target_cpu is set,
> and the migration is performed in mi_switch(). However, this increases the
> complexity of mi_switch(). One of the alternatives would be migration queues,
> but that has a few disadvantages:
> 1) this is needed only for SCHED_M2 scheduler because of per-CPU locks;
>    migration with SCHED_4BSD is trivial;
> 2) there is no easy way to abstract and close this in the scheduler.
> Comments?
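
For reference, the POSIX side of the quoted proposal can be exercised from
userland roughly as follows. This is only a minimal sketch built on the
standard pthread_setschedparam() and sched_get_priority_min()/
sched_get_priority_max() calls. The helper names clamp_priority() and
set_thread_sched() are made up for illustration, and only the standard
sched_priority field is used, not the sched_class member from the proposed
internal structure.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <string.h>

/*
 * Hypothetical helper: clamp *prio into the range the system reports for
 * the given policy.  Returns 0 on success or an errno value.
 */
static int
clamp_priority(int policy, int *prio)
{
	int lo, hi;

	errno = 0;
	lo = sched_get_priority_min(policy);
	hi = sched_get_priority_max(policy);
	if (errno != 0)
		return errno;	/* e.g. EINVAL for an unknown policy */

	if (*prio < lo)
		*prio = lo;
	if (*prio > hi)
		*prio = hi;
	assert(*prio >= lo && *prio <= hi);
	return 0;
}

/*
 * Hypothetical helper: set policy and priority of a thread in one step.
 * Returns 0 on success or an error number (typically EPERM when asking
 * for a real-time policy without sufficient privileges).
 */
static int
set_thread_sched(pthread_t t, int policy, int prio)
{
	struct sched_param sp;
	int error;

	error = clamp_priority(policy, &prio);
	if (error != 0)
		return error;

	memset(&sp, 0, sizeof(sp));
	sp.sched_priority = prio;
	return pthread_setschedparam(t, policy, &sp);
}
```

A caller would typically try set_thread_sched(pthread_self(), SCHED_FIFO, prio)
and fall back to the default policy if the call returns EPERM.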

Thanks for doing this, these are some pretty cool features to finally have!
I have some comments, which we discussed in private:

o 'nice' doesn't work with SCHED_M2, which is a regression. Asking people
  to use schedctl doesn't really wash, because nice is specified by POSIX,
  works on every other Unix type system and has been around for over 20
  years.

o SCHED_4BSD doesn't provide the new features and I'm strongly of the
  opinion that's a bug. By providing pluggable schedulers we gave people
  options. By fragmenting the feature set by scheduler we are taking those
  options away again.

o I mentioned that I don't like how we deal with on-processor migration
  in mi_switch() because it's complicated - and mi_switch() is already too
  complicated. I'll try to think of a better way to handle it.
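
To make the on-processor case concrete, it can be modelled in a few lines of
standalone C. This is a toy userland sketch, not kernel code: struct toy_lwp,
toy_migrate() and toy_switch() are invented names, only the LSONPROC state and
the l_target_cpu idea come from the description quoted above, and locking is
ignored entirely.

```c
#include <assert.h>

#define TOY_NOCPU	(-1)	/* no migration pending */

/* Simplified thread states; only the two needed for the model. */
enum toy_state { TOY_LSRUN, TOY_LSONPROC };

/* Hypothetical stand-in for the relevant fields of struct lwp. */
struct toy_lwp {
	enum toy_state	state;
	int		cpu;		/* CPU the thread belongs to */
	int		target_cpu;	/* pending target, or TOY_NOCPU */
};

/*
 * Request a migration.  Returns 1 if the thread was moved immediately,
 * 0 if the move had to be deferred because the thread is running.
 */
static int
toy_migrate(struct toy_lwp *l, int newcpu)
{
	if (l->state == TOY_LSONPROC) {
		/* Cannot move a running thread: just record the wish. */
		l->target_cpu = newcpu;
		return 0;
	}
	l->cpu = newcpu;
	l->target_cpu = TOY_NOCPU;
	return 1;
}

/*
 * Model of switching away from a running thread: any pending migration
 * request is carried out here.
 */
static void
toy_switch(struct toy_lwp *l)
{
	assert(l->state == TOY_LSONPROC);
	l->state = TOY_LSRUN;
	if (l->target_cpu != TOY_NOCPU) {
		l->cpu = l->target_cpu;
		l->target_cpu = TOY_NOCPU;
	}
}
```

The extra branch in toy_switch() corresponds to the work that ends up in
mi_switch() under the scheme described above.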

