tech-kern archive


MUTEX_CAS() and memory barriers



Background: The kernel mutex code is pretty generic and can be used on any platform that provides a pointer-sized atomic compare-and-swap primitive.  Each platform provides a definition of MUTEX_CAS() that expands to the right thing for that platform.

For the most part, architectures define it one of two ways:

<type 1>
int     _lock_cas(volatile uintptr_t *, uintptr_t, uintptr_t);
#define MUTEX_CAS(p, o, n)              _lock_cas((p), (o), (n))

<type 2>
#define MUTEX_CAS(p, o, n)              \
    (atomic_cas_ulong((volatile unsigned long *)(p), (o), (n)) == (o))
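
To make the difference in return-value semantics concrete, here is a minimal user-space sketch that uses C11 <stdatomic.h> as a stand-in for the kernel primitives.  The names sketch_lock_cas(), sketch_atomic_cas_ulong() and the two SKETCH_MUTEX_CAS_* macros are hypothetical; only the shape of the two macros is taken from the definitions above.  It assumes _lock_cas() reports success as nonzero, which is what the "== (o)" wrapper in <type 2> suggests, and the memory orderings chosen are assumptions as well.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a <type 1> _lock_cas(): reports success as
 * nonzero and orders memory itself (seq_cst here, as an assumption).
 */
static int
sketch_lock_cas(_Atomic uintptr_t *p, uintptr_t o, uintptr_t n)
{
	return atomic_compare_exchange_strong(p, &o, n);
}

/*
 * Hypothetical stand-in for atomic_cas_ulong(): returns the value that
 * was in *p before the operation, with no ordering beyond atomicity.
 */
static unsigned long
sketch_atomic_cas_ulong(_Atomic unsigned long *p, unsigned long o,
    unsigned long n)
{
	unsigned long seen = o;

	atomic_compare_exchange_strong_explicit(p, &seen, n,
	    memory_order_relaxed, memory_order_relaxed);
	return seen;	/* equals o on success, the current value on failure */
}

/* Same shape as the two definitions above, pointed at the stand-ins. */
#define SKETCH_MUTEX_CAS_TYPE1(p, o, n)	sketch_lock_cas((p), (o), (n))
#define SKETCH_MUTEX_CAS_TYPE2(p, o, n)	\
	(sketch_atomic_cas_ulong((p), (o), (n)) == (o))
```

Both macros evaluate to nonzero on success, so callers can treat them the same way.  Note that the <type 2> form evaluates (o) twice, so the argument should not have side effects.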


For the <type 1> cases, we have:

-> alpha (_lock_cas() is basically like atomic_cas_ulong() but has memory barrier insns and different return value semantics)
-> powerpc (like alpha, but also has to deal with an IBM405 erratum)
-> sh3 (_lock_cas() is a restartable atomic sequence that the interrupt handler groks - it is aliased to the normal atomic_cas_*() functions)
-> sparc64 (same situation as alpha)

Now, as for the two uses of MUTEX_CAS() in kern_mutex.c:

<MUTEX_ACQUIRE()>
        rv = MUTEX_CAS(&mtx->mtx_owner, oldown, newown);
        MUTEX_MEMBAR_ENTER();

<MUTEX_SET_WAITERS()>
        rv = MUTEX_CAS(&mtx->mtx_owner, owner, owner | MUTEX_BIT_WAITERS);
        MUTEX_MEMBAR_ENTER();

…and for the platforms that need it, MUTEX_MEMBAR_ENTER() expands to membar_enter().

So, for all platforms that require memory barriers, a memory barrier is already issued after the MUTEX_CAS().

There are a couple of takeaways here:

1. Some platforms have redundant memory barriers in their mutex implementations (one in _lock_cas() and another at the _lock_cas() call site).

2. kern_mutex.c issues memory barriers *even if the CAS failed*.  This is probably not a big deal, but it still rubs me the wrong way :-)
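
For (2), here is a rough user-space sketch of the two shapes: the barrier issued unconditionally, as in the kern_mutex.c snippets quoted earlier, versus the barrier issued only when the CAS succeeded.  Everything named sketch_* is hypothetical, and atomic_thread_fence(memory_order_acquire) is only a stand-in for MUTEX_MEMBAR_ENTER(); it is not claimed to be equivalent to membar_enter() on any particular platform.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a barrier-free MUTEX_CAS(): true on success. */
static bool
sketch_cas(_Atomic uintptr_t *p, uintptr_t o, uintptr_t n)
{
	return atomic_compare_exchange_strong_explicit(p, &o, n,
	    memory_order_relaxed, memory_order_relaxed);
}

/* Current shape: the barrier runs whether or not the CAS succeeded. */
static bool
sketch_acquire_always(_Atomic uintptr_t *owner, uintptr_t oldown,
    uintptr_t newown)
{
	bool rv = sketch_cas(owner, oldown, newown);

	/* Stand-in for MUTEX_MEMBAR_ENTER(). */
	atomic_thread_fence(memory_order_acquire);
	return rv;
}

/* Variant: pay for the barrier only when ownership was actually taken. */
static bool
sketch_acquire_on_success(_Atomic uintptr_t *owner, uintptr_t oldown,
    uintptr_t newown)
{
	bool rv = sketch_cas(owner, oldown, newown);

	if (rv)
		atomic_thread_fence(memory_order_acquire);
	return rv;
}
```

The two functions return the same result for the same inputs; the only difference is whether a failed attempt also executes the fence.  Whether the conditional form is worth the extra branch is a separate question that this sketch does not try to answer.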

Anyway, I’m much more concerned with (1).  I think at the very least, alpha and sparc64 don’t need to define their own _lock_cas() and can just use atomic_cas_ulong().  Furthermore, I think we can just let that be the default definition unless a platform has a REALLY good reason to override it (not even sh3 has to do so, because it aliases _lock_cas() to atomic_cas_ulong()).
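
A minimal sketch of what such a default could look like, assuming the usual #ifndef override pattern (the placement and the override mechanism are my assumptions, not existing code).  To keep it compilable outside the kernel, sketch_atomic_cas_ulong() is a hypothetical stand-in built on the GCC/Clang __atomic builtins; in the kernel the macro would call atomic_cas_ulong() directly.

```c
#include <stdint.h>

/*
 * Hypothetical user-space stand-in for atomic_cas_ulong(), using the
 * GCC/Clang __atomic builtins.  Returns the value previously in *p.
 */
static unsigned long
sketch_atomic_cas_ulong(volatile unsigned long *p, unsigned long o,
    unsigned long n)
{
	/* On failure the builtin writes the current value of *p into o. */
	__atomic_compare_exchange_n(p, &o, n, 0,
	    __ATOMIC_RELAXED, __ATOMIC_RELAXED);
	return o;
}

/*
 * Default definition.  A platform with a real reason to override it
 * would define MUTEX_CAS before this point (assumed mechanism).
 */
#ifndef MUTEX_CAS
#define MUTEX_CAS(p, o, n)						\
	(sketch_atomic_cas_ulong((volatile unsigned long *)(p),	\
	    (o), (n)) == (o))
#endif
```

With this shape, the barrier comes only from MUTEX_MEMBAR_ENTER() at the call site, so platforms that use the default do not pay for it twice.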

Thoughts?

-- thorpej


