tech-kern archive
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]
Re: Anomalies while handling p_nstopchild count
Date: Tue, 13 Oct 2015 08:42:55 +0700
From: Robert Elz <kre%munnari.OZ.AU@localhost>
The code is (pre fix) ...
p->p_waited = 0;
membar_producer();
p->p_stat = SSTOP;
My assumption is that the objective was to ensure that no-one would
ever see p_stat == SSTOP; before the p_waited = 0 is also visible.
That can be important, as p_lock (nor any other relevant lock) is not
acquired before inspecting p_stat (see for example find_stopped_child())
so there's nothing preventing another CPU from racing against this pair of
assignments, which without a memory barrier of some kind could be
reordered by the hardware, and detecting p_stat == SSTOP and p_waited != 0.
I don't believe that membar_consumer() is relevant at all, no-one
(for this purpose) cares what order reads from memory actually happen.
In this case, and generally[*], you need a membar_consumer matching
the membar_producer in order for the latter to have any useful effect.
Otherwise, if another CPU executes code that looks like
if (p->p_stat == SSTOP && p->p_waited == 0 && ...) ...;
such as in find_stopped_child, then the following sequence of events
might happen, due, for example, to CPU instruction reordering:
CPU 0 CPU 1
tmp0 = p->p_waited // gets 1
p->p_waited = 0
tmp1 = p->p_stat
p->p_stat = SSTOP
if (tmp1 == SSTOP && tmp0 == 0 && ...)
The membar_consumer forces CPU 1 not to reorder loads like that for
any reason.
On closer inspection, it looks like:
(a) The protocol is that under proc_lock but *not* p->p_lock, p_waited
must be written before p_stat, and p_stat must be read before
p_waited, as an optimization to avoid mutex/atomics.
(b) There's a missing membar_consumer in find_stopped_child and in
proc_reparent (but maybe it doesn't matter on x86, if all current
implementations will not in practice reorder those loads).
(c) The write to p->p_waited in exit1 looks fishy: it holds p->p_lock,
but not proc_lock.
That said, there is no missing membar_producer in fork1, kpsignal2,
proc_stop_done, proc_unstop, or sys_ptrace, because they all hold both
proc_lock and p->p_lock.
[*] One exception: data-dependent reads, as in q = *p; v = *q, need
only membar_datadep_consumer, which is a noop on most platforms,
unlike membar_consumer.
Paul's question really amounted to, I think, whether either the fact that
proc_lock will be held while p_waited is set to 0 would make a difference
(I can't see why that would really matter) or more likely, since the code
will become
p->p_waited = 0;
p->p_pptr->p_nstopchild++;
mutex_exit(proc_lock);
membar_producer();
p->p_stat = SSTOP;
whether the mutex_exit() makes the membar_producer() redundant.
And the answer to that is yes.
Home |
Main Index |
Thread Index |
Old Index