It turned out that modifying pthread_atfork() to stop calling
malloc() is enough to address the problem.
I have landed the changes and removed the '#if 0' kludge.
Thanks!
On 01.02.2020 13:59, Kamil Rytarowski wrote:
> On 31.01.2020 22:10, Andrew Doran wrote:
>> On Fri, Jan 31, 2020 at 06:55:00PM -0000, Christos Zoulas wrote:
>>
>>> In article <724af477-010b-9ddf-6ece-e23d7cf59079%gmx.com@localhost>,
>>> Kamil Rytarowski <n54%gmx.com@localhost> wrote:
>>>>
>>>> On 31.01.2020 03:38, Christos Zoulas wrote:
>>>>> And it is fixed now.
>>>>>
>>>>> christos
>>>>>
>>>>
>>>> OK. I am going to submit a bug report upstream and get some feedback
>>>> on the way forward here (delaying initialization).
>>>
>>> I think that the way forward (on our side) is to do away with libpthread,
>>> merge it with libc and kill all the stub nonsense.
>>
>> Agreed.
>>
>> pthread__init() does some expensive stuff like _lwp_ctl(). I think we can
>> safely & without hacks defer a lot of that till the first pthread_create().
>>
>> Andrew
>>
>
> This libc-libpthread split/merge is a red herring.
>
> The problem here is the mutual dependency between the POSIX threads
> library and the malloc library.
>
> I did some investigation and here are my findings:
>
> 1. jemalloc abuses initialization and initializes itself very early,
> with a constructor:
>
> /*
>  * If an application creates a thread before doing any allocation in
>  * the main thread, then calls fork(2) in the main thread followed by
>  * memory allocation in the child process, a race can occur that
>  * results in deadlock within the child: the main thread may have
>  * forked while the created thread had partially initialized the
>  * allocator. Ordinarily jemalloc prevents fork/malloc races via the
>  * following functions it registers during initialization using
>  * pthread_atfork(), but of course that does no good if the allocator
>  * isn't fully initialized at fork time. The following library
>  * constructor is a partial solution to this problem. It may still be
>  * possible to trigger the deadlock described above, but doing so would
>  * involve forking via a library constructor that runs before
>  * jemalloc's runs.
>  */
> #ifndef JEMALLOC_JET
> JEMALLOC_ATTR(constructor)
> static void
> jemalloc_constructor(void) {
> 	malloc_init();
> }
> #endif
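
The prefork/postfork registration that the comment refers to follows the
usual pthread_atfork() pattern. Below is a minimal sketch of that pattern.
It is not jemalloc's code: demo_lock and the demo_* functions are
hypothetical stand-ins for an allocator-internal mutex and its handlers.

```c
#include <pthread.h>

/* Hypothetical lock standing in for an allocator-internal mutex. */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;

/* Runs in the forking thread before fork(): take the lock. */
static void
demo_prefork(void)
{
	pthread_mutex_lock(&demo_lock);
}

/* Runs in the parent after fork(): release the lock. */
static void
demo_postfork_parent(void)
{
	pthread_mutex_unlock(&demo_lock);
}

/* Runs in the child after fork(): release the inherited lock. */
static void
demo_postfork_child(void)
{
	pthread_mutex_unlock(&demo_lock);
}

/* Register the three handlers; returns 0 on success. */
static int
demo_register_fork_handlers(void)
{
	return pthread_atfork(demo_prefork, demo_postfork_parent,
	    demo_postfork_child);
}
```

The point is that the lock is held by the forking thread across fork(), so
the child never inherits it in a state owned by a thread that does not
exist there. This only helps once the handlers have been registered, which
is why the initialization order matters.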
>
> Relevant commit:
>
> commit 20f1fc95adb35ea63dc61f47f2b0ffbd37d39f32
> Author: Jason Evans <je%fb.com@localhost>
> Date: Tue Oct 9 14:46:22 2012 -0700
>
> Fix fork(2)-related deadlocks.
>
> Add a library constructor for jemalloc that initializes the allocator.
> This fixes a race that could occur if threads were created by the main
> thread prior to any memory allocation, followed by fork(2), and then
> memory allocation in the child process.
>
> Fix the prefork/postfork functions to acquire/release the ctl, prof, and
> rtree mutexes. This fixes various fork() child process deadlocks, but
> one possible deadlock remains (intentionally) unaddressed: prof
> backtracing can acquire runtime library mutexes, so deadlock is still
> possible if heap profiling is enabled during fork(). This deadlock is
> known to be a real issue in at least the case of libgcc-based
> backtracing.
>
> Reported by tfengjun.
>
> 2. FreeBSD added a hack: an internal pthread_mutex_init() variant
> called _pthread_mutex_init_calloc_cb(). It passes a callback pointer
> to jemalloc's tiny calloc().
>
> This is very ugly and I consider it the wrong way of bootstrapping malloc.
>
> 3. There is a problem inside libpthread. It was designed not to call
> malloc() early, so as not to trigger malloc initialization. However,
> this was broken because it calls the at_fork functions:
>
> #0  malloc (size=size@entry=16)
>     at /usr/src/external/bsd/jemalloc/lib/../dist/src/jemalloc.c:2052
> #1  0x0000776b71d71a44 in af_alloc ()
>     at /usr/src/lib/libc/gen/pthread_atfork.c:80
> #2  af_alloc () at /usr/src/lib/libc/gen/pthread_atfork.c:74
> #3  _pthread_atfork (prepare=prepare@entry=0x0, parent=parent@entry=0x0,
>     child=child@entry=0x776b7220b3e5 <pthread__fork_callback>)
>     at /usr/src/lib/libc/gen/pthread_atfork.c:121
> #4  0x0000776b7220cacd in pthread__init ()
>     at /usr/src/lib/libpthread/pthread.c:260
> #5  0x0000776b71d7a585 in _libc_init ()
>     at /usr/src/lib/libc/misc/initfini.c:128
>
> These at_fork routines also caused false positives in Leak
> Sanitizer. I had to pacify the sanitizer and disable tracking of its
> allocations.
>
>
> This patch removes the '#if 0' hack from src/lib/libpthread and
> switches at_fork from malloc() to mmap()+munmap().
>
> http://netbsd.org/~kamil/patch-00219-libpthread-libc-jemalloc.txt
>
> This test patch disables the constructor hack:
>
> http://netbsd.org/~kamil/patch-00220-jemalloc-disable-constructor.txt
>
> With these changes everything seems to work.
>
> To avoid the FreeBSD-specific hack with the constructor and still
> always initialize jemalloc during libc bootstrap, I propose the
> following approach:
>
> - add a __libc_malloc_init() call to _libc_init()
> - redirect __libc_malloc_init() to jemalloc__init() when jemalloc is used
> - otherwise redirect to an empty stub
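
The three steps above can be sketched roughly as follows. This is only an
illustration under assumptions, not the linked patch. The demo_* names
stand in for __libc_malloc_init(), jemalloc__init() and _libc_init(), and
the DEMO_WITH_JEMALLOC switch is a made-up way of selecting the redirect.

```c
/* Hypothetical build switch; remove the define to select the empty stub. */
#define DEMO_WITH_JEMALLOC 1

/* Set once the (pretend) allocator has been initialized. */
static int demo_allocator_ready;

#ifdef DEMO_WITH_JEMALLOC
/* Stand-in for jemalloc__init(): do the real allocator setup here. */
static void
demo_jemalloc_init(void)
{
	demo_allocator_ready = 1;
}
#define demo_libc_malloc_init()	demo_jemalloc_init()
#else
/* Other allocators need no early setup: redirect to an empty stub. */
static void
demo_malloc_init_stub(void)
{
}
#define demo_libc_malloc_init()	demo_malloc_init_stub()
#endif

/* Stand-in for _libc_init(): one explicit hook at a well-defined point. */
static void
demo_libc_init(void)
{
	/* earlier bootstrap steps that must not allocate go here */
	demo_libc_malloc_init();
	/* later steps may allocate from this point on */
}
```

With an explicit call, the point at which the allocator comes up is fixed
by the bootstrap code itself and does not depend on constructor ordering.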
>
> Here is a patch that does everything and works fine for me.
>
> http://netbsd.org/~kamil/patch-00222-jemalloc-enhancements.txt
>
> There are no longer any calls into jemalloc before it is ready, and
> jemalloc is still always initialized, but at the proper time.
>
> (gdb) r
> The program being debugged has been started already.
> Start it from the beginning? (y or n) y
> Starting program: /tmp/a.out
>
> Breakpoint 2, jemalloc__init ()
>     at /usr/src/external/bsd/jemalloc/lib/../dist/src/jemalloc.c:3209
> 3209 malloc_init();
> (gdb) c
> Continuing.
>
> Breakpoint 3, malloc_init_hard ()
>     at /usr/src/external/bsd/jemalloc/lib/../dist/src/jemalloc.c:1533
> 1533 malloc_init_hard(void) {
> (gdb) bt
> #0  malloc_init_hard ()
>     at /usr/src/external/bsd/jemalloc/lib/../dist/src/jemalloc.c:1533
> #1 0x00007f7ff763f9c4 in ?? () from /usr/lib/libc.so.12
> #2 0x00007f7ff7ef9400 in ?? ()
> #3 0x00007f7ff763ac99 in _init () from /usr/lib/libc.so.12
> #4 0x0000000000000000 in ?? ()
>