tech-kern: Re: memory (re-)allocation woes

Subject: Re: memory (re-)allocation woes
To: NetBSD Kernel Technical Discussion List <tech-kern@NetBSD.ORG>
From: theo borm <theo4490@borm.org>
List: tech-kern
Date: 11/30/2004 11:06:54
Joerg Sonnenberger wrote:

>On Mon, Nov 29, 2004 at 01:15:43AM +0100, theo borm wrote:
>[skip troll-like ranting over flaws of NetBSD]
>  
>
>>well... I'll start thinking about how to make reallocs less wastefull of
>>resources; in view of its current implementation performance (memcpy,
>>eventually leading to excessive swapping) can hardly be considered an
>>issue..... I guess I have some serious reading up to do. Can anyone give me
>>a hint if there is a portable (architecture wise) way to remap pages from a
>>user program?
>>    
>>
>
>There is no portable way to remap pages created via MAP_ANON. That is the
>fundamental reason why realloc can't just shove the pages around and be
>done. Basically, it does implement what you call "an algorithm without
>support from the hardware VM".
>
>Joerg
>  
>
Even without support for /remapping/, using mmap in [m/re]alloc may
actually be a good way to limit resource wastage and improve performance
of reallocs:

As I read the malloc code (but please keep in mind it was a very quick 
glance)
mmap is only used to allocate the page directory, whereas the actual 
storage is
maintained using sbrk.

The page directory /can/ be extended when nescessary, but this means mmaping
a new one, copying the old one and then munmapping the old one. So for the
page directory the (momentary) resource "wastage" would be the size of the
old page directory.

Because sbrk defines a single contiguous area of memory, chunks too small
to hold the new reallocation size will never become available, leading to
more extreme resource wastage (as seen): in the reallocation test program:
 (one + is  one allocated MB, one - is one orphaned MB)

+
-++
---+++
------++++
+++++
....

[338 * '+']
[338 * '-'][339 * '+']
[338 * '-'][339 * '-'][340 * '+']
[341 * '+']

etcetera.

There are malloc/realloc combinations conceivable that are even
*more pathological, possibly even so pathological that none of the
blocks will *ever* be reused or freed, which would lead to the realloc
test program requiring up to (M+I*(N*(N+1)/2)) bytes in step N, with
M the initial allocation and I the increase. For instance with M=I=1MB,
at step 340 such a pathological program would require ~57 GByte (...!)

To make a long story short: the current realloc can lead to unbounded
resource wastage: In amount of allocated storage, in time spent in
swapping the newly touched pages in and out of core /and/ in exhaustion
of address space (hey, 64 bit addressing may be usefull after all).

I see no reason (but then again, I'm not aware of all the subtilities
that have influenced the current implementation) why the storage area
itself could not also be managed using mmap.

A /mildly/ resource unfriendly realloc implementation would probably
do something like this:

newaddress=mmap(0,newsize,PROT_READ|PROT_WRITE,\
            MAP_ANON,-1,0);
if (newadress!=MAP_FAILED){
    memcpy(newaddress,oldaddress,oldsize);
    munmap(oldaddress,oldsize)
    return(newaddress);
}
else return(NULL);

Of course with some additional bookkeeping around it to keep track of
memory. A slightly less resource unfriendly mmap may first try to
grow the region, and only if that fails do the above. Even the above
would have the favourable property that maximum resource wastage would
be /limited/ (to the size of the old allocation). (the address space
and time wastage may still occur)

The best implementation would use the new mremap function,
guaranteeing that, when reallocation is at all possible (suitably large
piece of the address space is available), no memory needs to be physically
copied, thereby assuring that the /realloc/ will not reference any (swapped
out) pages ;-)

So, before I embark on a pointless journey, are there any compelling
reasons why I should not reimplement malloc using mmap instead of
sbrk? Of course using sbrk guarantees that memory is in one contiguous
block, assuring that a dump of this region includes all the programs'
"allocated" storage (as well as some unfreeable storage). The malloc
meta data is however already mmaped, so the usefullness of the data
dump without this is argueable.

Cheers, Theo.