tech-userlevel archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: PATCH libatomic



On Mon, May 11, 2020 at 01:11:32AM +0200, Kamil Rytarowski wrote:
> On 10.05.2020 18:38, Kamil Rytarowski wrote:
> >  LLDB will be patched to avoid atomics.
> I have checked LLDB and std::atomic<uint64_t> is used on purpose and was
> switched from mutexes 3 years ago.
> 
> https://github.com/llvm/llvm-project/commit/f9d16476573e16856bdb3250c817b0a2c631d2b1
> 
> Reverting this (or rewriting) is not viable as this change improved the
> performance, the code was changed meanwhile and there were added two
> more associated std::atomic<> variables. LLDB also requires recent C++
> runtime.

Using 64bit atomics when they are present is fine. When are not
available, it is a huge hidden cost. This problem was discussed on the
LLDB mailing lists a while ago and there are essentially four possible
approaches:

(1) If there are no 64bit atomic ops, go back to using the explicit
mutex. This makes the cost explicit at the very least.
(2) Use 32bit atomics with reduced resolution when there are no 64bit
atomic ops available. This needs some careful analysis on how much
precision is actually necessary.
(3) Use a pair of 32bit atomics and handle overflow from one timer to
the other. This introduces a small race condition when reading the
timers, but IIRC it is ruled out by outside contraints.
(4) Use TLS for the regular timer operation and aggregate them on thread
exit. This might not be an option here depending on the thread life
time.

The main reason why I didn't just submit a patch the last time this came
up is it is a hard decision on which approach works best here. It's easy
to write the code, but not ensuring that it is the best approach. I
think (3) is entirely fine for LLDB's use, but it is not a magic general
solution. Note that the cases where the race condition happen are a
small subset when the bad scheduling could mess up the time measurement
(e.g. the purpose of the code) completely. For all practical purposes,
the code with atomics here is supposed to be lock-free and ensures
different threads can't block each other. That's quite important when
taking nanosecond timing. If it wasn't relevant, scaling to milli or
micro seconds would generally avoid the need for 64bit atomics after
all.

Joerg


Home | Main Index | Thread Index | Old Index