NetBSD-Users archive

Re: track down why malloc fails



In article <20080626103552.57c26453.jklowden%schemamania.org@localhost>,
James K. Lowden <netbsd-users%NetBSD.org@localhost> wrote:
>Christos Zoulas wrote:
>> In article <20080625145818.GA29327%oak.schemamania.org@localhost>,

>Is there nothing similar to time(1) for memory?  I would love to know 1)
>the high-water mark for a process after it runs and 2) the amount of
>memory in use and the amount requested at the time malloc
>failed.  (Details on why it failed would be nice, too.) 
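
On the time(1)-for-memory question: the kernel already hands back a peak
resident set size in the rusage structure when a child is reaped with
wait4(2). Here is a minimal sketch of a wrapper that reports it. The
function name run_and_measure is made up for this example. It covers only
point 1 (the high-water mark, and only for resident memory, not address
space); it says nothing about what was in use when malloc failed.

```python
import os
import subprocess
import sys


def run_and_measure(argv):
    """Run argv to completion and return (exit_code, peak_rss).

    peak_rss is ru_maxrss from the child's rusage as returned by wait4(2):
    kilobytes on NetBSD and Linux, bytes on macOS.
    """
    proc = subprocess.Popen(argv)
    _, status, rusage = os.wait4(proc.pid, 0)
    if os.WIFEXITED(status):
        code = os.WEXITSTATUS(status)
    else:
        code = -os.WTERMSIG(status)  # killed by a signal, e.g. SIGABRT
    proc.returncode = code  # the child was reaped here, so tell Popen
    return code, rusage.ru_maxrss


if __name__ == "__main__":
    # Harmless demo child; substitute the real command line.
    code, peak = run_and_measure([sys.executable, "-c", "pass"])
    print("exit code %d, peak RSS %d" % (code, peak))
```

Calling run_and_measure(["svn", "commit"]) from the nightly script would
log the peak for each run, which can then be compared with the limits
shown by ulimit.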
>
>Regarding ulimit, I put "ulimit -a" in the script and got this output:
>
>time(seconds)        unlimited
>file(blocks)         unlimited
>data(kbytes)         131072
>stack(kbytes)        2048
>coredump(blocks)     unlimited
>memory(kbytes)       245740
>locked memory(kbytes) 81913
>process(processes)   160
>nofiles(descriptors) 64

try:
ulimit -S -m $(ulimit -m -H)
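
That one-liner sets the soft memory limit to whatever the hard limit is.
If the job is started from something other than a shell script, the same
step can be done in the process itself before it execs svn. A sketch
using Python's resource module follows; raise_soft_to_hard is a
hypothetical helper name, and RLIMIT_RSS is assumed to be the limit that
"ulimit -m" reports as memory(kbytes). Note that the resource module
works in bytes where ulimit prints kbytes.

```python
import resource


def raise_soft_to_hard(which=resource.RLIMIT_RSS):
    """Set the soft limit for `which` to its hard limit.

    Returns (old_soft, hard) so the caller can log what changed.
    Raising the soft limit up to the hard limit needs no privileges.
    """
    soft, hard = resource.getrlimit(which)
    resource.setrlimit(which, (hard, hard))
    return soft, hard
```

Passing resource.RLIMIT_DATA instead does the same for the data(kbytes)
line.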
>
>That tells me the process can allocate 256 MB of memory.  That's plenty. 
>For one thing, my own ulimit is the same.  For another, we're only doing
>"svn commit".  Finally, the machine has only 256 MB RAM.  If the process
>came anywhere close to allocating all physical memory, the other
>processes would all swap out and the machine would crawl.  
>
>So I have enough memory, and a bug.  But it's pernicious, because it crops
>up in different places (if gdb bt is to be believed) and is repeatable
>only in the dark of night.  
>
>The ktrace.out was 757,271,470 bytes.  What's odd is that when I re-ran
>the job this morning (under my account) the ktrace.out is 10% of that
>size: 85,489,521 bytes.  
>
>Last night's ktrace fingers mmap(2) as the culprit.  I don't know how to
>read the dump, though.  Here's the end of it:
>
>        K 25
>        svn:wc:ra_dav:version-url
>        V 109
>        /svn/Prod"
> 28999 svn      GIO   fd 4 read 8 bytes
>       "uctDevel"
> 28999 svn      RET   read 4096/0x1000
> 28999 svn      CALL  break(0x10059000)
> 28999 svn      RET   break 0
> 28999 svn      CALL  break(0x1005b000)
> 28999 svn      RET   break 0
> 28999 svn      CALL  mmap(0,0x21000,3,0x1002,0xffffffff,0,0,0)
>                                       ^^^^^^
>                                       [MAP_PRIVATE, MAP_ANON]
> 28999 svn      RET   mmap -1 errno 12 Cannot allocate memory
>
>The mmap man page says mmap has 6 arguments.  The kdump output above shows
>8.  

Because kdump does not know about 64-bit arguments, so the 64-bit file
offset is printed as separate 32-bit words.
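
To make the eight values easier to read, here is a small sketch that
decodes them. It assumes a 32-bit machine where the words are, in order:
addr, len, prot, flags, fd, a padding word, and then the low and high
halves of the offset. That layout is an assumption to check against the
syscall definition on the system in question. The flag values are the BSD
ones (the MAP_PRIVATE and MAP_ANON bits match the annotation in the dump
above). The function names are made up for this example.

```python
PROT_BITS = {0x1: "PROT_READ", 0x2: "PROT_WRITE", 0x4: "PROT_EXEC"}
FLAG_BITS = {
    0x0001: "MAP_SHARED",
    0x0002: "MAP_PRIVATE",
    0x0010: "MAP_FIXED",
    0x1000: "MAP_ANON",
}


def bit_names(value, table):
    """Return the names of the known bits set in value, plus any leftover bits in hex."""
    names = [name for bit, name in sorted(table.items()) if value & bit]
    leftover = value & ~sum(table)
    if leftover:
        names.append(hex(leftover))
    return names


def decode_mmap_words(words):
    """Decode the eight 32-bit words kdump printed for the mmap call."""
    addr, length, prot, flags, fd, pad, off_lo, off_hi = words
    if fd >= 1 << 31:  # reinterpret as a signed 32-bit int
        fd -= 1 << 32
    return {
        "addr": addr,
        "len": length,
        "prot": bit_names(prot, PROT_BITS),
        "flags": bit_names(flags, FLAG_BITS),
        "fd": fd,
        "pad": pad,
        "offset": (off_hi << 32) | off_lo,
    }


if __name__ == "__main__":
    print(decode_mmap_words([0, 0x21000, 3, 0x1002, 0xFFFFFFFF, 0, 0, 0]))
```

Read that way, the call asks for 135,168 bytes of readable and writable
anonymous private memory at an address of the kernel's choosing. An fd of
-1 is the usual value for MAP_ANON, so ENOMEM rather than EBADF is the
error that matters here.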

>This looks to me like we're trying to allocate 135,168 bytes at an address
>of mmap's choosing.  Is that what it says?  The fd argument appears to be
>-1, but in that case (according to the docs) mmap should return EBADF or
>perhaps EACCES.  
>We're running with MALLOC_OPTIONS=X here, so the end of the story arrives
>quickly: 
>
> 28999 svn      CALL  break(0x10059000)
> 28999 svn      RET   break 0
> 28999 svn      CALL  write(2,0xbfbffb0c,3)
> 28999 svn      GIO   fd 2 wrote 3 bytes
>       "svn"
> 28999 svn      RET   write 3
> 28999 svn      CALL  write(2,0x4840a4fa,0xd)
> 28999 svn      GIO   fd 2 wrote 13 bytes
>       " in malloc():"
> 28999 svn      RET   write 13/0xd
> 28999 svn      CALL  write(2,0x4840a444,8)
> 28999 svn      GIO   fd 2 wrote 8 bytes
>       " error: "
> 28999 svn      RET   write 8
> 28999 svn      CALL  write(2,0x4840a508,0xf)
> 28999 svn      GIO   fd 2 wrote 15 bytes
>       "out of memory.
>       "
> 28999 svn      RET   write 15/0xf
> 28999 svn      CALL  __sigprocmask14(3,0x482055c4,0)
> 28999 svn      RET   __sigprocmask14 0
> 28999 svn      CALL  getpid
> 28999 svn      RET   getpid 28999/0x7147, 15783/0x3da7
> 28999 svn      CALL  kill(0x7147, SIGABRT)
> 28999 svn      RET   kill 0
> 28999 svn      PSIG  SIGABRT SIG_DFL
> 28999 svn      NAMI  "svn.core"
>
>What to do?  Suggestions only welcome.  
>

Try the ulimit command above. Also, what do ulimit -H -a and ulimit -S -a print?
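
If it is easier to capture both sets from inside the nightly job than
from the shell, something along these lines prints the soft and hard
values side by side. It is only a sketch: the mapping from the ulimit -a
labels to RLIMIT_* names is my assumption, limits_table is a made-up
name, and the resource module reports sizes in bytes where ulimit prints
kbytes or blocks.

```python
import resource

# ulimit -a label -> resource constant name (assumed mapping)
ULIMIT_ROWS = [
    ("time", "RLIMIT_CPU"),
    ("file", "RLIMIT_FSIZE"),
    ("data", "RLIMIT_DATA"),
    ("stack", "RLIMIT_STACK"),
    ("coredump", "RLIMIT_CORE"),
    ("memory", "RLIMIT_RSS"),
    ("locked memory", "RLIMIT_MEMLOCK"),
    ("process", "RLIMIT_NPROC"),
    ("nofiles", "RLIMIT_NOFILE"),
]


def fmt(value):
    """Render a limit the way ulimit does for the unlimited case."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def limits_table():
    """Return [(label, soft, hard)] for every limit this platform defines."""
    rows = []
    for label, name in ULIMIT_ROWS:
        which = getattr(resource, name, None)
        if which is None:  # not available on this platform
            continue
        soft, hard = resource.getrlimit(which)
        rows.append((label, fmt(soft), fmt(hard)))
    return rows


if __name__ == "__main__":
    for label, soft, hard in limits_table():
        print("%-14s soft=%-12s hard=%s" % (label, soft, hard))
```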

christos


