NetBSD-Bugs archive
bin/43409: jemalloc x (threads + rlimit) = perpetual ENOMEM
>Number: 43409
>Category: bin
>Synopsis: jemalloc x (threads + rlimit) = perpetual ENOMEM
>Confidential: no
>Severity: serious
>Priority: high
>Responsible: bin-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Thu Jun 03 10:35:00 +0000 2010
>Originator: Antti Kantee
>Release: 5.0
>Organization:
>Environment:
i386
>Description:
Under some conditions, a multithreaded program can trigger perpetual
ENOMEM from posix_memalign() (and probably malloc() too). The problem
persists even if the program releases far more memory than it tries
to allocate.
(The following analysis is tied to the program in "How-To-Repeat".)
If we run the program under ktrace with MALLOC_OPTIONS set to U, we
first see the following backend allocation failure:
9283 1 a.out CALL mmap(0,0x100000,3,0x14001002,0xffffffff,0,0,0)
9283 1 a.out RET mmap -1 errno 12 Cannot allocate memory
9283 1 a.out CALL break(0x9000000)
9283 1 a.out RET break 0
9283 1 a.out CALL utrace(0xbbbb4c51,0xbfbfdda0,0xc)
9283 1 a.out MISC malloc: 12, 00000000001000000010f008
9283 1 a.out RET utrace 0
mmap(MAP_ANON) failed, but since break() was still successful, the
allocation could be carried out. Then:
9283 1 a.out CALL mmap(0,0x100000,3,0x14001002,0xffffffff,0,0,0)
9283 1 a.out RET mmap -1 errno 12 Cannot allocate memory
9283 1 a.out CALL break(0x9100000)
9283 1 a.out RET break -1 errno 12 Cannot allocate memory
9283 1 a.out CALL utrace(0xbbbb4c51,0xbfbfdda0,0xc)
9283 1 a.out MISC malloc: 12, 000000000010000000000000
9283 1 a.out RET utrace 0
Now break() fails too, so the allocation fails. This causes our
program to release the "emergency" memory:
9283 1 a.out CALL utrace(0xbbbb4c51,0xbfbfdda4,0xc)
9283 1 a.out MISC malloc: 12, 005090bb0000000000000000
9283 1 a.out RET utrace 0
9283 1 a.out CALL utrace(0xbbbb4c51,0xbfbfdda4,0xc)
9283 1 a.out MISC malloc: 12, 006090bb0000000000000000
9283 1 a.out RET utrace 0
[.....]
And to retry the allocation:
9283 1 a.out CALL mmap(0,0x100000,3,0x14001002,0xffffffff,0,0,0)
9283 1 a.out RET mmap -1 errno 12 Cannot allocate memory
9283 1 a.out CALL utrace(0xbbbb4c51,0xbfbfdda0,0xc)
9283 1 a.out MISC malloc: 12, 000000000010000000000000
9283 1 a.out RET utrace 0
However, memory is not allocated from the recently freed fragments
(even though they are the same size as what we are trying to allocate),
but rather more memory is requested from the backend. Since none has
been freed to the backend, this request fails. Further requests
to malloc anything will fail as well.
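The release-and-retry pattern that the test program relies on can be sketched roughly as follows. This is only an illustration, not code from the report: the names reserve_fill(), reserve_release() and alloc_with_reserve() are hypothetical, and the slot count and block size simply mirror store[8] and the 4096-byte requests in the program. On a healthy allocator the retry is expected to succeed once the reserve has been freed; the failure described above is that it does not.

```c
#define _POSIX_C_SOURCE 200112L
#include <stdio.h>
#include <stdlib.h>

#define RESERVE_SLOTS 8		/* hypothetical: mirrors store[8] */
#define BLOCK_SIZE    4096	/* same size and alignment as the report */

static void *reserve[RESERVE_SLOTS];

/* Fill empty reserve slots; returns the number of slots now held. */
static int
reserve_fill(void)
{
	int i, n = 0;

	for (i = 0; i < RESERVE_SLOTS; i++) {
		if (reserve[i] == NULL &&
		    posix_memalign(&reserve[i], BLOCK_SIZE, BLOCK_SIZE) != 0)
			reserve[i] = NULL;	/* do not trust the slot on failure */
		if (reserve[i] != NULL)
			n++;
	}
	return n;
}

/* Hand the reserve back to the allocator; returns blocks freed. */
static int
reserve_release(void)
{
	int i, n = 0;

	for (i = 0; i < RESERVE_SLOTS; i++) {
		if (reserve[i] != NULL) {
			free(reserve[i]);
			reserve[i] = NULL;
			n++;
		}
	}
	return n;
}

/*
 * Allocate one block.  On failure, release the reserve and retry once.
 * Returns 0 on success or the error number from posix_memalign().
 */
static int
alloc_with_reserve(void **out)
{
	int error;

	error = posix_memalign(out, BLOCK_SIZE, BLOCK_SIZE);
	if (error == 0)
		return 0;
	if (reserve_release() == 0)
		return error;		/* nothing left to give back */
	return posix_memalign(out, BLOCK_SIZE, BLOCK_SIZE);
}
```

In the failing runs, the second posix_memalign() call in a routine like alloc_with_reserve() is the one that keeps returning ENOMEM even though same-sized blocks were freed immediately before it.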
Note that we freed the memory from the same thread we are attempting
to reallocate it from. ktrace shows no other calls to malloc between
free() and the next call to posix_memalign(). That makes me
unsure whether this is really a malloc problem or something else.
Also, ideally other threads would be able to steal/use memory from
other arenas if backend memory has been exhausted. I didn't read
the code closely enough to see if this is supported.
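For readers decoding the ktrace output above, the flags word 0x14001002 passed to mmap() can be picked apart with a small helper. This is a sketch under assumptions: the constants are hard-coded copies of the values I believe NetBSD's <sys/mman.h> uses (MAP_PRIVATE, MAP_ANON, and the alignment exponent stored in the top byte by MAP_ALIGNED()), prefixed with NB_ so the example builds on any system. Check them against the header on the affected release before relying on them.

```c
#include <stdio.h>

/* Assumed NetBSD <sys/mman.h> values, copied here for portability. */
#define NB_MAP_PRIVATE		0x0002UL
#define NB_MAP_ANON		0x1000UL
#define NB_MAP_ALIGNMENT_SHIFT	24
#define NB_MAP_ALIGNMENT_MASK	(0xffUL << NB_MAP_ALIGNMENT_SHIFT)

/* Non-zero if the mapping is anonymous (not backed by a file). */
static int
flags_is_anon(unsigned long flags)
{
	return (flags & NB_MAP_ANON) != 0;
}

/* Non-zero if the mapping is private (copy-on-write). */
static int
flags_is_private(unsigned long flags)
{
	return (flags & NB_MAP_PRIVATE) != 0;
}

/* Requested alignment in bytes, or 0 if none was requested. */
static unsigned long
flags_alignment(unsigned long flags)
{
	unsigned long log2;

	log2 = (flags & NB_MAP_ALIGNMENT_MASK) >> NB_MAP_ALIGNMENT_SHIFT;
	return log2 != 0 ? 1UL << log2 : 0;
}

/* Print a one-line summary of a traced mmap() flags word. */
static void
flags_describe(unsigned long flags)
{
	printf("flags %#lx: anon=%d private=%d align=%#lx\n", flags,
	    flags_is_anon(flags), flags_is_private(flags),
	    flags_alignment(flags));
}
```

Under these assumptions, flags_describe(0x14001002UL) reports an anonymous, private mapping aligned to 0x100000 bytes, which matches the 0x100000-byte length in the trace: each backend request asks for one 1 MB region aligned to 1 MB. The prot argument 3 would then be PROT_READ|PROT_WRITE, and the fd 0xffffffff is -1.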
>How-To-Repeat:
Run the following program:
=== snip ===
#include <sys/types.h>
#include <sys/resource.h>
#include <sys/sysctl.h>
#include <sys/mman.h>
#include <err.h>
#include <kvm.h>
#include <limits.h>
#include <stdlib.h>
#include <unistd.h>
#include <pthread.h>

/* helper thread: allocate and free one page in a tight loop */
static void *
mythread(void *arg)
{
	void *v;

	for (;;) {
		if (posix_memalign(&v, 4096, 4096) == 0)
			free(v);
	}
}

int
main(int argc, char *argv[])
{
	char buf[_POSIX2_LINE_MAX];
	struct kinfo_proc2 *kp;
	kvm_t *kd;
	struct rlimit rl;
	pthread_t pt;
	int cnt;

	/* scope out current size, give us 16megs more */
#define MOOOORE (16*1024*1024)
	kd = kvm_openfiles(NULL, NULL, NULL, KVM_NO_FILES, buf);
	if (kd == NULL)
		errx(1, "kvm_openfiles: %s", buf);
	kp = kvm_getproc2(kd, KERN_PROC_PID, getpid(), sizeof(*kp), &cnt);
	if (kp == NULL)
		errx(1, "kvm_getproc2: %s", kvm_geterr(kd));

	/* cap the address space at current size + 16MB */
	if (getrlimit(RLIMIT_AS, &rl) == -1)
		err(1, "getrlimit");
	rl.rlim_cur = kp->p_vm_vsize + MOOOORE;
	if (setrlimit(RLIMIT_AS, &rl) == -1)
		err(1, "setrlimit");

	/* cap the data segment (break) at 16MB */
	if (getrlimit(RLIMIT_DATA, &rl) == -1)
		err(1, "getrlimit");
	rl.rlim_cur = rl.rlim_max = MOOOORE;
	if (setrlimit(RLIMIT_DATA, &rl) == -1)
		err(1, "setrlimit");

	pthread_create(&pt, NULL, mythread, NULL);

	{
		void *store[8];
		void *v;
		int i;

		/* set aside the "emergency" memory */
		for (i = 0; i < 8; i++)
			posix_memalign(&store[i], 4096, 4096);

		/* allocate until the allocator reports failure */
		while (posix_memalign(&v, 4096, 4096) == 0)
			continue;

		/* release the emergency memory and retry */
		for (i = 0; i < 8; i++)
			free(store[i]);
		if (posix_memalign(&v, 4096, 4096) != 0)
			err(1, "fail");
	}
}
=== snip ===
Note that the condition triggers quite rarely:
pain-rustique:186:~> repeat 500 ./a.out
a.out: fail: Cannot allocate memory
a.out: fail: Cannot allocate memory
a.out: fail: Cannot allocate memory
a.out: fail: Cannot allocate memory
pain-rustique:187:~>
i.e. 496/500 times it did not show up. I could not trigger the
problem without the "helper" thread.
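One detail of the test program worth keeping in mind when adapting it: POSIX specifies that posix_memalign() reports failure through its return value, not through errno. The "Cannot allocate memory" text above comes from err() reading errno, which happens to be ENOMEM here because the failed mmap() and break() calls set it. A variant that reports the returned error number directly might look like the sketch below; checked_memalign() and is_out_of_memory() are hypothetical helper names, not part of any library.

```c
#define _POSIX_C_SOURCE 200112L
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Wrapper around posix_memalign() that reports failures using the
 * returned error number instead of errno.  On failure *out is set
 * to NULL so the caller never sees a stale pointer.
 */
static int
checked_memalign(void **out, size_t align, size_t size)
{
	int error;

	error = posix_memalign(out, align, size);
	if (error != 0) {
		*out = NULL;
		fprintf(stderr, "posix_memalign(%lu, %lu): %s\n",
		    (unsigned long)align, (unsigned long)size,
		    strerror(error));
	}
	return error;
}

/* Distinguish memory exhaustion from a bad alignment argument. */
static int
is_out_of_memory(int error)
{
	return error == ENOMEM;
}
```

With a wrapper like this, the final check in the test program could exit only when is_out_of_memory() is true for the returned value, which makes the report independent of whatever errno was left behind by earlier system calls.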
>Fix:
Currently unknown.