Port-pmax archive

Re: Problems updating to new libc on pmax



2010/4/28 Eric Haszlakiewicz <erh%nimenees.com@localhost>:
> On Tue, Apr 27, 2010 at 09:26:49PM +0200, Erik Bertelsen wrote:
>> 2010/4/27 Matthias Drochner <M.Drochner%fz-juelich.de@localhost>:
>> > bertelsen.erik%gmail.com@localhost said:
>> >> [1] ? Segmentation fault (core dumped) cat >&2 <<...
>> >
>> > can you make sense of the core dumps?
>>

Hello again,

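I've done some further experiments to try to pinpoint the problem
that I have with recent libc on pmax.

I tried the following patch as a poor man's debugger. The extra
readlink() calls do nothing useful, but they show up in the ktruss
output and so mark how far the initialization gets: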


===================================================================
RCS file: /cvsroot/src/lib/libc/stdlib/jemalloc.c,v
retrieving revision 1.21
diff -c -r1.21 jemalloc.c
*** jemalloc.c  4 Mar 2010 22:48:31 -0000       1.21
--- jemalloc.c  9 May 2010 07:35:02 -0000
***************
*** 3574,3581 ****
--- 3574,3584 ----
        malloc_mutex_init(&chunks_mtx);
        RB_INIT(&huge);
  #ifdef USE_BRK
+                       readlink("/etc/brk.conf", buf, sizeof(buf) - 1);
        malloc_mutex_init(&brk_mtx);
+                       readlink("/etc/brk1.conf", buf, sizeof(buf) - 1);
        brk_base = sbrk(0);
+                       readlink("/etc/brk2.conf", buf, sizeof(buf) - 1);
        brk_prev = brk_base;
        brk_max = brk_base;
  #endif
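
For anyone who wants to try the same marker trick outside libc, here is
a minimal standalone sketch. It is not part of the patch above: the
names trace_marker() and probe_sbrk() and the 1024-byte buffer are made
up for illustration, and it only assumes the standard readlink() and
sbrk() calls. Each marker is a readlink() on a path that should not
exist, so it appears in a syscall trace as an ENOENT line with a
recognizable name.

```c
#define _DEFAULT_SOURCE   /* exposes sbrk()/readlink() on glibc; harmless elsewhere */
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper: emit a trace marker by calling readlink() on a
 * path that is expected not to exist.  Returns the errno that readlink()
 * reported, or 0 if the path unexpectedly resolved.  The caller's errno
 * is preserved so the marker does not disturb surrounding code.
 */
static int trace_marker(const char *path)
{
    char buf[1024];
    int saved = errno;
    int seen = 0;

    if (readlink(path, buf, sizeof(buf) - 1) == -1)
        seen = errno;
    errno = saved;
    return seen;
}

/*
 * Hypothetical helper: bracket sbrk(0) with two markers.  If the trace
 * shows the first marker but not the second, the fault happened in
 * between.
 */
static void *probe_sbrk(void)
{
    void *base;

    trace_marker("/etc/brk1.conf");
    base = sbrk(0);               /* query the current break, no change */
    trace_marker("/etc/brk2.conf");
    return base;
}
```

Calling probe_sbrk() from a small test program and running it under
ktruss should show the two readlink() lines around whatever sbrk(0)
does on the platform in question.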


With this patch installed, the complete ktruss output of a plain ls command is:

# /rescue/mv /tmp/libc.so.12.172 .
# ls
Memory fault (core dumped)
# ktruss ls
  8077      1 ktruss   fktrace                     = 0, 2112355200
  8077      1 ktruss   emul(netbsd)
  8077      1 ktruss   fcntl(0x4, 0x3, 0)          = 1, 2112355200
  8077      1 ktruss   fcntl(0x4, 0x4, 0x1)        = 0, 2112355200
  8077      1 ls       execve("/bin/ls", 0x7fffdd08, 0x7fffdd10) JUSTRETURN
  8077      1 ls       emul(netbsd)
  8077      1 ls       mmap(0, 0x8000, 0x3, 0x1002, 0xffffffff, 0, 0, 0) = 0x7dff7000
  8077      1 ls       open("/etc/ld.so.conf", 0, 0x7dff0cc0) Err#2 ENOENT
  8077      1 ls       open("/lib/libutil.so.7", 0, 0) = 3, 2147472849
  8077      1 ls       __fstat50(0x3, 0x7fffd500)  = 0, 2147472849
  8077      1 ls       mmap(0, 0x1000, 0x1, 0x1, 0x3, 0, 0, 0) = 0x7dff6000
  8077      1 ls       munmap(0x7dff6000, 0x1000)  = 0
  8077      1 ls       mmap(0, 0x28000, 0x5, 0x10000002, 0x3, 0, 0, 0) = 0x7dfb0000
  8077      1 ls       mmap(0x7dfd5000, 0x2000, 0x3, 0x12, 0x3, 0, 0x15000, 0) = 0x7dfd5000
  8077      1 ls       mmap(0x7dfd7000, 0x1000, 0x3, 0x1012, 0xffffffff, 0, 0, 0) = 0x7dfd7000
  8077      1 ls       mprotect(0x7dfc5000, 0x10000, 0) = 0, -4096
  8077      1 ls       close(0x3)                  = 0
  8077      1 ls       open("/lib/libc.so.12", 0, 0) = 3, 2147472849
  8077      1 ls       __fstat50(0x3, 0x7fffd500)  = 0, 2147472849
  8077      1 ls       mmap(0, 0x1000, 0x1, 0x1, 0x3, 0, 0, 0) = 0x7dff6000
  8077      1 ls       munmap(0x7dff6000, 0x1000)  = 0
  8077      1 ls       mmap(0, 0x14c000, 0x5, 0x10000002, 0x3, 0, 0, 0) = 0x7de60000
  8077      1 ls       mmap(0x7df95000, 0x8000, 0x3, 0x12, 0x3, 0, 0x125000, 0) = 0x7df95000
  8077      1 ls       mmap(0x7df9d000, 0xf000, 0x3, 0x1012, 0xffffffff, 0, 0, 0) = 0x7df9d000
  8077      1 ls       mprotect(0x7df86000, 0xf000, 0) = 0, -4096
  8077      1 ls       close(0x3)                  = 0
  8077      1 ls       __sysctl(0x7fffdbec, 0x2, 0x7dfa9fd0, 0x7fffdbe8, 0, 0) = 0, 81
  8077      1 ls       __sysctl(0x7fffdbf8, 0x2, 0x7fffdbf0, 0x7fffdbf4, 0, 0) = 0, 6
  8077      1 ls       rasctl(0x7defeb90, 0x14, 0) = 0, -1
  8077      1 ls       issetugid()                 = 0, 2113498716
  8077      1 ls       __sysctl(0x7fffc640, 0x2, 0x7dfa48a0, 0x7fffc63c, 0, 0) = 0, 6
  8077      1 ls       __sysctl(0x7fffc564, 0x2, 0x7dfab280, 0x7fffc560, 0, 0) = 0, 6
  8077      1 ls       readlink("/etc/malloc.conf", 0x7fffc654, 0x400) = 1, 1
  8077      1 ls       readlink("/etc/brk.conf", 0x7fffc654, 0x400) Err#2 ENOENT
  8077      1 ls       readlink("/etc/brk1.conf", 0x7fffc654, 0x400) Err#2 ENOENT
  8077      1 ls       break(0x4167d0)             = 0, 4286416
  8077      1 ls       SIGSEGV SIG_DFL
#

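This seems to indicate that it is the 'brk_base = sbrk(0);' statement
that fails: the readlink() calls for /etc/brk.conf and /etc/brk1.conf
both appear, the break() call returns 0, and then the process gets
SIGSEGV before the readlink() for /etc/brk2.conf is ever reached.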

Another observation: not all programs fail in this way. For example,
with the faulty libc installed, the ftp command can still be used.
Random tests suggest that many (perhaps all) applications in /bin fail
like ls above, while many (perhaps all) applications in /usr/bin can
still be used. I can't really believe that the /bin vs. /usr/bin path
itself is significant, however...

Another observation: backing out the latest update to jemalloc.c by
reverting to revision 1.20 does not make any difference.

Kind regards,
Erik

