tech-kern archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: mutexes, locks and so on...



On 11/12/10 03:58, David Holland wrote:
On Thu, Nov 11, 2010 at 06:35:55PM +0100, Johnny Billquist wrote:
  >>  [spl mutex stuff]
  >
  >  Hum? So that was introduced with the new locking code then? Because
  >  back when we used splraise/splx, the above would not have worked.

This would have worked:

    s = splvm();
    slow_stuff_1();
    s2 = splhigh();
    fast_stuff();
    splx(s2);
    slow_stuff_2();
    splx(s);

When converted to the current mutex model, this runs slow_stuff_2() at
splhigh, which is probably undesirable. The conclusion is that there
isn't much code like this and any that remains should be rewritten.

Indeed. That was what I was thinking of/assuming the code looked like. The new implementation means it's become less optimal (but maybe not significantly so). But I would assume this is the way codepaths looked like.

There's also this; it would have been considered abusive in the old
days but that doesn't mean nobody ever wrote it:

    s = splhigh();
    fast_stuff();
    splvm();
    slow_stuff();
    splx(s);

Assuming you'd replace splhigh()/splx() with mutex_spin_{enter,exit}, the code would continue to work just as well, and the same way in the new world too.

What the new world do, however, is keep a single global variable (per cpu) for the previous ipl, as well as a global mutex counter, to see if we are exiting the last mutex, at which point the ipl will be restored.

This means we now have a global variable which must be made sure that it's updated and maintaned atomic enough. I think it's an ugly hack, but as I said, maybe people now are writing code that depends on this behviour, and it can't be gotten rid off.

  >  >Also, AFAIK effect was was actually measured during newlock2 branch
  >  >development.  Supporting such micro-optimisation would mean quite a bit
  >  >of complexity in critical mutex(9) interface with practically no gain.
  >
  >  My initial reaction was that each mutex should have just stored
  >  what ipl it came from, and just restored it back to that when the
  >  mutex was released. That would also allow us to skip the mutex
  >  count in cpu_info, as well as the global saved ipl in cpu_info, but
  >  then I realized that this solution would break if people actually
  >  wrote code like
  >  lock(a)
  >  lock(b)
  >  release(a)
  >  release(b)

...which is very common.

It is? I would have thought (and hoped) that people normally did:
lock(a)
lock(b)
unlock(b)
unlock(a)

Keeping track of how many times each spl level has been asserted and
lowering to the right level at each mutex release does not add as much
complexity as rmind claims, and I'm not convinced it costs much either
(we've debated this before), but the model we have is only untidy, not
wrong, and I'm not interested enough to spend time benchmarking and
tuning to fix it.

I agree that it's not wrong, but untidy. Keeping track of ipl levels could have been kept within the mutex instead, thus simplifying both the lock and unlock code, at the expense that people actually had to unlock mutexes in the reverse order they acquired them. Just as with the splraise/splx before.

But I doubt I will have the energy to go through all of NetBSD to see how mutexes are used, to spot (and possibly fix) any that would break such a rule. :-(

  >  Hmm. Hard to argue about this. While I think it's nice if we try to
  >  keep the kernel agnostic, the user api is not something I'm arguing
  >  about changing. But it would be nice if we could leave it possible
  >  to do things in different ways when it's not really the effect of a
  >  CAS that is needed, but we'd still like to keep it in a way that
  >  allowed the compiled code to be as nice and efficient as possible.

It usually is a CAS that's needed. However, if it's worthwhile there's
no real reason the vax MD code couldn't provide its own versions of
higher-level actions that are by default written using CAS. There are
other old/slow platforms that could probably benefit from such hooks
if someone wanted to take the trouble to set things up accordingly.
The only question is whether it's really worth the maintenance
overhead of allowing it.

I might think a bit more about this. However, the way the rwlock interface looks right now, it is not that simple to just provide an alternative implementation. But it might be worth the effort.

        Johnny


Home | Main Index | Thread Index | Old Index