tech-userlevel archive

Re: PATCH libatomic



On 08.05.2020 00:49, maya%NetBSD.org@localhost wrote:
> I am under the impression that (at least GCC) compilers will not emit
> intrinsic calls if they are guaranteed to be available on the target.
> 
> This means libatomic needs to:
> 
> - Optimize: we can runtime detect, which emitted code cannot do.
> 
> Note that this means providing this libatomic will cause us to stop
> noticing 64-bit atomics used when compiling for -march=i486, our default
> for i386. We will stop upgrading those to -march=i586 and users will see
> a performance penalty.
> 

Runtime detection could be done through ifunc (is it ready for NetBSD?).

The standard C/C++ way to detect whether atomic operations are real
(lock-free) is atomic_is_lock_free(). This is a feature, not a bug (as
some people claim). libatomic can provide its own atomic_is_lock_free()
that detects the CPU type at run time, and it can redirect either to the
real CPU intrinsic or to the lock-based fallback.
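
To make the lock-free query concrete, here is a minimal sketch using only
<stdatomic.h>. The sketch_* names are hypothetical and not part of any
patch. The ATOMIC_*_LOCK_FREE macros are 0 (never), 1 (sometimes) or 2
(always); on a target where the answer is "sometimes", the run-time query
may end up as a call into libatomic and need it at link time.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical helper: turn an ATOMIC_*_LOCK_FREE value (0, 1, 2) into text. */
const char *sketch_describe(int level) {
    assert(level >= 0 && level <= 2);
    switch (level) {
    case 0:  return "never lock-free";
    case 1:  return "sometimes lock-free, ask at run time";
    default: return "always lock-free";
    }
}

/* Run-time query on a concrete object, as the standard intends. */
bool sketch_int_is_lock_free(void) {
    static atomic_int probe;
    return atomic_is_lock_free(&probe);
}

/* Compile-time view for 64-bit atomics, the case discussed for i486. */
int sketch_llong_level(void) {
    return ATOMIC_LLONG_LOCK_FREE;
}
```

A caller could print sketch_describe(sketch_llong_level()) to see what the
compiler promises for the chosen -march, and use the run-time query only
when the answer is "sometimes".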

My code is a proof of concept, showing that this is not terribly
complicated.

Possibly the best approach (unless someone is interested in writing a
native libatomic) is to use the LLVM runtime (after upstreaming the
local patches) for MKLLVM=yes and the GCC runtime for MKGCC=yes.

> - Provide the fallback code
> 
> And that it isn't necessary for libatomic to:
> 
> - Attempt to cause the compiler to emit the intrinsic
> 
> Which should make this code a lot simpler.
> 
> ---
> 
> +#define LOCK_FREE_ACTION(type)                                                 \
> +  return atomic_compare_exchange_strong_explicit(                              \
> +      (_Atomic(type) *)ptr, (type *)expected, *(type *)desired, success,       \
> +      failure)
> +  LOCK_FREE_CASES();
> +#undef LOCK_FREE_ACTION
> 
> This feels a bit offensive...
> Mentally I am reading this as "I don't believe the compiler will
> optimize out some scenarios in cleaner code with static inline, so
> forcing the optimization to happen via C preprocessor".
> I wonder if it's really true.
> 
> The macros seem overly complicated to avoid generics, I don't think
> pre-C11 is a concern for us. I wonder if it can be simplified.
> 
> 

The libatomic macros generate functions like:

__atomic_fetch_add_1
__atomic_fetch_add_2
__atomic_fetch_add_4
__atomic_fetch_add_8

__atomic_fetch_sub_1
__atomic_fetch_sub_2
__atomic_fetch_sub_4
__atomic_fetch_sub_8

__atomic_fetch_and_1
__atomic_fetch_and_2
__atomic_fetch_and_4
__atomic_fetch_and_8

etc., which cuts down on the repetition.

I don't know how to write this code differently. The length of atomic.c
is comparable to the length of stdatomic.h, so it's not that bad.
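
As a rough illustration of how one macro can stamp out the per-size
families listed above, here is a simplified sketch. It is not the code
from the patch: the sketch_ prefix, the single global spinlock, and the
two-argument signature are assumptions made for brevity (the real
__atomic_fetch_*_N entry points also take a memory-order argument and
cannot simply be defined under their own names in ordinary C).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical global spinlock guarding the fallback path. */
static atomic_flag sketch_lock = ATOMIC_FLAG_INIT;

static void sketch_acquire(void) {
    while (atomic_flag_test_and_set_explicit(&sketch_lock,
                                             memory_order_acquire))
        ; /* spin until the flag was previously clear */
}

static void sketch_release(void) {
    atomic_flag_clear_explicit(&sketch_lock, memory_order_release);
}

/* One row per operand size: suffix and matching C type. */
#define SKETCH_SIZES(X, name, op) \
    X(name, op, 1, uint8_t)       \
    X(name, op, 2, uint16_t)      \
    X(name, op, 4, uint32_t)      \
    X(name, op, 8, uint64_t)

/* Lock-based fetch-and-op: returns the old value, stores old <op> val. */
#define SKETCH_FETCH_OP(name, op, n, type)                         \
    type sketch_atomic_fetch_##name##_##n(type *ptr, type val) {   \
        assert(ptr != 0);                                          \
        sketch_acquire();                                          \
        type old = *ptr;                                           \
        *ptr = (type)(old op val);                                 \
        sketch_release();                                          \
        return old;                                                \
    }

/* Twelve functions from three lines. */
SKETCH_SIZES(SKETCH_FETCH_OP, add, +)
SKETCH_SIZES(SKETCH_FETCH_OP, sub, -)
SKETCH_SIZES(SKETCH_FETCH_OP, and, &)
```

For example, sketch_atomic_fetch_add_4(&x, 1) returns the previous value
of x and leaves x incremented. Adding another operation such as or or xor
is one more SKETCH_SIZES line.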



