tech-kern archive


Re: 4.x -> 5.x locking?

On Wed, Nov 09, 2011 at 02:40:19AM -0500, Mouse wrote:
 > I found mutex(9), condvar(9), and the like.  But it is not clear to me
 > what I need to do to be MP-ready.  Do I need to use the stuff from
 > mb(9), or membar_ops(3), or what?  It's not clear from the manpages
 > whether, for example, membar_enter is usable within the kernel; the
 > reference from mutex(9) seems to imply so, but I've been surprised
 > before.

When you sort this out, please suggest doc fixes... :-/

(against -5 is fine, although I suspect some of these pages have
already been improved in head)

 > with respect to (for example) mutex operations, and it's not clear
 > whether the "other memory accesses" includes accesses by other
 > processors.  I could have the other processor do a membar_enter() after
 > taking the mutex, but, again, it's not clear whether the accesses the
 > manpage talks about refer to "this CPU" or "any CPU".  ("Any CPU" is
 > more useful here (and probably more expensive), but "this CPU" is what
 > I'd expect from what I've read of memory barriers in CPU documentation.)

It is actually "any CPU". You don't need memory barriers on a
uniprocessor machine (except for accessing I/O registers, but usually
one accesses those entirely uncached) because the instruction
reordering done by superscalar processors is arranged so it doesn't
produce visible effects. So far, anyway. This might of course change
next year.

It's nonetheless a different problem from explicit cache coherence
control; that is, the cache is coherent, so writes once made will be
seen by other processors; we just need to enforce ordering on (some
of) the writes so that the values observable in memory don't become
inconsistent with each other.

(That is, given

   #include <stdio.h>

   static int x, y;

   /* called on another CPU while main() is looping */
   static void foo(void) {
      x = 1;
      y = 1;
   }

   int main() {
      while (1) {
         int yy = y;
         int xx = x;
         printf("%d %d\n", xx, yy);
      }
   }

since there are no memory barriers it's possible on some platforms for
y to be stored before x, which can then result in a printout of "0 1",
which is ostensibly impossible. It's also potentially possible for x
to be written before y, and then for x to be *read* before y, which
can also result in a printout of "0 1".)
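
For comparison, here is a rough sketch of the same pair with ordering
enforced on both sides. It uses portable C11 <stdatomic.h> fences rather
than the kernel's membar_ops(3) interface, and the helper name observe()
is made up for illustration. The in-kernel equivalent would presumably
be a membar_producer() between the two stores and a membar_consumer()
between the two loads, but treat that mapping as an assumption, not as
something the manpages promise.

```c
#include <stdatomic.h>

/* Shared flags; atomic_int so concurrent access is well defined in C11. */
static atomic_int x, y;

/* Writer side: store x, then y. The release fence keeps the store to x
 * from becoming visible after the store to y. */
static void foo(void) {
    atomic_store_explicit(&x, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&y, 1, memory_order_relaxed);
}

/* Reader side (hypothetical helper): load y, then x. The acquire fence
 * keeps the load of x from being satisfied before the load of y. */
static void observe(int *xx, int *yy) {
    *yy = atomic_load_explicit(&y, memory_order_relaxed);
    atomic_thread_fence(memory_order_acquire);
    *xx = atomic_load_explicit(&x, memory_order_relaxed);
}
```

With foo() on one CPU and observe() looping on another, a reader that
sees yy == 1 must also see xx == 1, so "0 1" can no longer be printed.
Note that both fences are needed: one covers the store-reordering case
and the other the load-reordering case described above.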

I've often thought that if we're going to need to have memory barriers
we may as well also include explicit cache control, since you need
pretty much the same logic in the same places; the difference at the
moment is that explicit cache control requires identifying regions of
memory to handle whereas memory barriers can, for the time being at
least, afford to apply indiscriminately to all pending memory

 > The mb(9) page specifically warns that it does not entail any promises
 > about pushing stores to visibility by other processors, so I don't
 > think it's useful here - am I wrong?

That is probably intended to caution against assuming cache coherence,
and if so should be reworded.

 > And, finally, with reference to the membar_ops(3) page, what does it
 > mean for a load to "reach global visibility"?

A load? Nothing. It's presumably meant to say something like "will
access what is globally visible" and should be reworded...

David A. Holland
