tech-kern archive


Re: NetBSD thread model: a few questions

On Fri, Jan 16, 2009 at 06:32:21PM +0100, Jean-Yves Migeon wrote:

> - is it valid for a LWP to call kpreempt_disable() but still use 
> synchronisation primitives in the non-preemptible part, thus resulting 
> in voluntary context switches?

Yes, it's valid. Ideally you would take the lock first and then disable
preemption, especially if it's known that the lock could cause the caller to
spin-wait for some time. I realise that's not always possible or easy, but
it's the ideal pattern.

> - when a LWP is bound to a CPU and that CPU is put offline, bound 
> threads may still execute on it.


> Is there a way to correctly assert that bound threads running on a CPU 
> do not hold any locks when we are about to save the CPU context, to 
> avoid any possible deadlock between bound LWP and LWP waiting for locked 
> resources?

You can only check that any bound threads are in a quiescent state. That can
be done using checkpoints. As an example, user threads can be stopped using
SIGSTOP and other signals. A thread stops only when it reaches a defined
checkpoint: userret(). We know that a thread holds no locks when calling that
routine, which prevents stopped threads from deadlocking the system. Thread
suspension with _lwp_suspend() works similarly.

For kernel threads it is not that simple, because kernel threads - including
soft interrupts:

- do not return to user space 
- do not accept signals
- do not have a common routine that can be used as a checkpoint

> Or is it acceptable to alter the affinity of bound threads arbitrarily
> and restore it later?

No: bound threads access per-CPU data structures.

You want to do complete CPU offline, right? I will describe it as 'pause'
because we already have an 'offline' :-). For userspace threads that have
affinity or are part of a processor set, any pause would need to be prevented
and an error returned.

For kernel activity and bound threads here is my suggestion:

- Offline the CPU when beginning pause (SPCF_OFFLINE).

- Scan the callout table on the target CPU and migrate all its callouts to a
  CPU that will still be online+unpaused. Ensure that callouts will not be
  migrated back onto the target CPU while it is _offline_. That could be as
  simple as a check for the SPCF_OFFLINE flag. See kern_timeout.c.

- Tweak subr_workqueue.c so it deals with _offlined_ CPUs. It should move
  any existing work on any existing workqueues to another CPU and re-route
  any new work that is enqueued.

- Prevent device interrupts (timer included) occurring on the pausing CPU.
  They would need to be rerouted or, in the case of the timer, disabled.

- Take the CPU out of any masks that would cause broadcast IPIs to affect it.

- Add a routine in subr_xcall.c to ensure that any active xcalls are
  flushed. Once flushed, set a flag to prevent further _broadcast_ xcalls
  affecting the pausing CPU - while holding xc_lock!

- Adapt the logic in softint_disestablish() so that it can wait for _any_
  active soft interrupts on a CPU, and use it to drain the pausing CPU.

- At this point, nothing else will want/need to allocate memory on the
  pausing CPU, so drain all per-CPU pool_cache components from the CPU. See
  subr_pool.c. Search for 'xcall'. It would need to use unicast xcalls.

- Do a unicast xcall to set SPCF_PAUSED on the CPU at this point.

- The pausing CPU should now be sitting in idle_loop(). We can't save its
  state with an IPI because it could hold locks (x86 suspend/resume code
  does this - it's broken as a result, please ignore it). I suggest adding
  something like this to idle_loop():

        spc = &ci->ci_schedstate;
        while (__predict_false((spc->spc_flags & SPCF_PAUSED) != 0)) {
                spc->spc_flags |= SPCF_PAUSELOOP;
                /* maybe call md function */
        }

- The function that is coordinating pause of the CPU could then sleep for 1
  tick in a loop, testing spc->spc_flags until SPCF_PAUSELOOP goes non-zero.
  At that point it is safe to kill the CPU.

- When starting the CPU back up, clear SPCF_PAUSELOOP with a unicast xcall.
