Subject: Re: memory (re-)allocation woes
To: NetBSD Kernel Technical Discussion List <tech-kern@NetBSD.ORG>
From: theo borm <theo4490@borm.org>
List: tech-kern
Date: 11/27/2004 21:55:55
Dear Greg,

Thanks for the reply and for taking the time to check if you can
replicate my problems. I've added some more comments and
new issues below....

Greg A. Woods wrote:

>[ On Saturday, November 27, 2004 at 16:05:29 (+0100), theo borm wrote: ]
>  
>
>>Subject: memory (re-)allocation woes
>>
>>As a follow-up to my previous post, I checked if the same
>>problem (user program memory allocation freezing/rebooting
>>a 1.6.2 system) does occur in simpler set-ups. I ended up
>>adding local root and swap disks to some of the cluster
>>nodes and did fresh installs of 1.6.2 on those.
>>
>>I am sad to say that it does.
>>    
>>
>
>No problem for the slightly more portable (architecture-wise) version
>attached below on my AlphaServer 4000 (2x400MHz, 1.5GB RAM, 2G swap)
>
>14:32 [703] $ /sbin/swapctl -lk
>Device      1K-blocks     Used    Avail Capacity  Priority
>/dev/ld0b     2048256   452360  1595896    22%    0
>
>14:32 [704] $ ulimit -a
>time(cpu-seconds)    unlimited
>file(blocks)         unlimited
>coredump(blocks)     unlimited
>data(kbytes)         1048576
>stack(kbytes)        32768
>lockedmem(kbytes)    262144
>memory(kbytes)       2048000
>nofiles(descriptors) 13196
>processes            4116
>
>14:29 [702] $ ./trealloc 1048576 1048576 
>[[ .... ]]
>step 342: reallocating 359661568 bytes... trealloc: realloc() failed: Cannot allocate memory
>  
>
I note that the program could not (re-)allocate 359661568 bytes
(343 Mbyte), which is about a third of the memory available to user
processes:

data(kbytes)         1048576


I've now also set this parameter to less than (mem+swap)
and now my little program at least *sometimes* exits gracefully....
If I run /two/ instances at once (with the same resource limit), then
one *still* gets killed; you may want to give that a try as well...
(there is no need to start allocating at 1Mbyte though; this will
only increase the time you have to wait)

This makes the data size limit of little use: it does not even
prevent user processes from being killed by the kernel for
(naively) allocating more than a third of the memory available
to them. (In this case, on a 512+512 Mbyte system with a
512 MB data limit, it got killed allocating ~170 Mbytes :-( )
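
As a rough check of the "one third" figure, here is a minimal C sketch. The function names limit_fraction and current_data_limit are hypothetical and not part of trealloc; the only system call used is the POSIX getrlimit(2) with RLIMIT_DATA.

```c
#include <sys/resource.h>

/*
 * Fraction of a data size limit consumed by a single request.
 * The limit is given in kbytes, the unit "ulimit -a" prints.
 */
double limit_fraction(unsigned long long request_bytes,
                      unsigned long long limit_kbytes)
{
    return (double)request_bytes / ((double)limit_kbytes * 1024.0);
}

/*
 * Read the calling process's soft data size limit in bytes.
 * Returns 0 on success and -1 on failure. An unlimited setting
 * is passed through unchanged as RLIM_INFINITY.
 */
int current_data_limit(rlim_t *out)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_DATA, &rl) != 0)
        return -1;
    *out = rl.rlim_cur;
    return 0;
}
```

With the numbers quoted above, limit_fraction(359661568ULL, 1048576ULL) comes out just under 0.335, and 170 Mbytes against a 512 MB limit comes out at about 0.33, so both failures sit at roughly a third of the limit.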

>
>The system got a little sluggish, and my emacs processes got swapped out
>because I wasn't using them (I was just playing Spider to keep a small
>program running to get a good feel for system response :-), but
>otherwise it just chugged along fine.
>  
>
You still had 512MB of physical memory to play with. In my case "sluggish"
was not the word. I waited for more than 8 hours for the system to come back
up, and it didn't. I've also had sudden reboots because of this.

>It was interesting to watch it in top as it grew and shrunk and then
>grew larger again with each step.  :-)
>
>  
>
I have a graph, and it looks to me as if reallocs result in pages being
held back "somewhere" after a free, unless the system is starved, and only
then is the memory actually returned to the free pool.

I've also (cursorily) looked at the libc realloc implementation, and as far
as I can tell there is fairly little intelligence. A realloc (mostly)
results in three steps:
a) allocation of new memory
b) copy of memory contents
c) freeing of old memory

This would explain reallocs being limited to allocating half the datasize
limit, but does not explain why it stops at a third.
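
The three steps can be sketched in C as follows. This is only an illustration of the allocate-copy-free pattern, not the actual libc code; naive_realloc and naive_realloc_peak are hypothetical names, and the caller has to pass the old size because standard C offers no way to query it.

```c
#include <stdlib.h>
#include <string.h>

/* Illustrative grow-by-copy realloc, following steps a) to c). */
void *naive_realloc(void *old, size_t old_size, size_t new_size)
{
    void *fresh = malloc(new_size);        /* a) allocate new memory */
    if (fresh == NULL)
        return NULL;                       /* the old block stays valid */

    if (old != NULL) {
        size_t n = old_size < new_size ? old_size : new_size;
        memcpy(fresh, old, n);             /* b) copy memory contents */
        free(old);                         /* c) free old memory */
    }
    return fresh;
}

/*
 * Bytes that must be live at the same time during one such step:
 * the old and the new block coexist between a) and c).
 */
size_t naive_realloc_peak(size_t old_size, size_t new_size)
{
    return old_size + new_size;
}
```

When the block grows in small increments, old_size is nearly equal to new_size, so the peak is close to twice the requested size. That is where the "half the datasize limit" expectation comes from; it does not account for the observed third.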

>I think the problem might be that the default rlimits are too large for
>the resources available on the systems you're using.  On BSD systems
>which over-commit memory resources you cannot allow user processes to
>get out of hand and soak up too much of the system's resources.
>  
>
Well, setting the resource limits lower may help, but it will not prevent
processes that naively allocate memory from being killed.

>Note how the program failed on my system when it did reach the maximum
>data size limit imposed by my rlimits.
>  
>
Nope. It just exceeded 1/3 of the limit.

>(that the program thought it was only trying to use 702464 kb at the
>  
>
Could you please explain where that number comes from?

>time only reveals, I think, the wastage in our malloc() implementation)
>
>  
>
I've looked at it, and /hope/ that someone will find the time to come
up with something smarter and more efficient; I guess that some
processors can move memory around in user process address space
through remapping it in their mmu....
I'm not quite sure if /I/ am equipped to do so, and I'm not sure either
how portable this would be; it might mean a different malloc
implementation for different architectures, possibly moving its
implementation from libc to the kernel, leaving only stubs in libc.
(Would that be a good idea at all?)

As a side note:
I've also taken the time to install 1.6 on a system, and curiously enough
the system /does/ get sluggish, and the program /does/ get killed, but it
does not grind to a (virtual or real) halt, and I've not managed to crash
it either. I'll give 2.0rc5 a try later...

with kind regards,

Theo.