Subject: Re: further vm adventures: Ah, found!
To: None <jiho@postal.c-zone.net>
From: Chuck Cranor <chuck@dworkin.wustl.edu>
List: tech-kern
Date: 04/27/1998 23:06:26
hi-

>Chuck and ddb have saved the day again.

excellent!


>Consider an image which, at 350x234 in 16-bit color, occupies 160K.  When the X
>server allocates this, it winds up with two map entries backed by two separate
>objects.
>The address range for each map entry only covers 1 page, but the object backing
>each entry has 40 pages, all of which are resident.
>Looking at the objects reveals that, indeed, all 80 pages (320K) are there, and
>the pages of the two objects seem to be interleaved, or alternating.

>What's more, these objects DO NOT go away when the client that sent the data to
>the server exits.  I'm not sure if that's because the malloc thinks it's only
>caching 8K (two pages), or because the server just hangs onto them.  Either
>way, it explains why only 8K turns up in the server's RSS:  that's all its map
>entries show.  The active list, naturally, shows all 320K.

>Which is why I STILL say the old Mach vm seems to have a problem here.  I can't
>see how you get 1-page map entries for 40-page objects.  If the server
>requested two 40-page allocations, the map entries should reflect that.  If the
>server somehow requested the second allocation overlapping all but the first
>page of the first allocation, what should have happened to the 39 overlapped
>pages in the first vm object?!?


can you post the details of how the X server's map changes across an
exit-exec-exit cycle?   i'm only interested in the virtual addresses and
object pointers that have changed between runs.


here is my guess as to what is happening, maybe you can confirm or 
deny it:

    in BSD/Mach VM, when you mmap() an area of anonymous memory
    it creates a zero-fill vm_object with a reference count of 1 and
    maps it into the process' address space.

    if you munmap() part, but not _all_ of the mapping, the memory in
    the unmapped region is not actually freed (kind of like what you
    are seeing with the active list).   this is because the reference
    count on the backing object is still greater than zero, so the object
    isn't terminated (even though it is now larger than needed).   this 
    can create a "swap memory leak effect" if you use a certain pattern 
    of mmap/munmap's... I am thinking that maybe the X server is doing 
    this.


    one thing you should also note is that sometimes sys_mmap() will
    grow one of these objects rather than mmapping a new one.   so if
    you do two MAP_ANON mmaps it may result in only one map entry
    (look for "See if we can avoid creating a new entry by extending 
     one of our neighbors." in vm_map_insert() in vm_map.c).


here is an example program that will tickle the problem:
/* punmap.c: partial memory unmap ... check for swap leak */

#include <err.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/mman.h>

void doit(const char *str)
{
  printf("%s: ", str);
  fflush(stdout);
  system("vmstat -s | grep 'pages active'");
}

int main(void)
{
  char *ptr;
  long pgsz = getpagesize();

  doit("START");
  ptr = mmap(NULL, 100 * pgsz, PROT_READ|PROT_WRITE, MAP_ANON|MAP_PRIVATE,
                -1, 0);
  if (ptr == MAP_FAILED) err(1, "mmap");
  memset(ptr, 0, 100 * pgsz);           /* fault in all 100 pages */
  doit("AFTER MMAP 100 PAGES");
  if (munmap(ptr + pgsz, 99 * pgsz) == -1) err(1, "munmap");
  doit("AFTER MUNMAP 99 of 100 PAGES");
  if (munmap(ptr, pgsz) == -1) err(1, "munmap");
  doit("AFTER MUNMAP OF FINAL PAGE");
  return 0;
}


the result of this program with the NetBSD 1.3 Mach VM is:
START:      4019 pages active
AFTER MMAP 100 PAGES:      4119 pages active
AFTER MUNMAP 99 of 100 PAGES:      4119 pages active
AFTER MUNMAP OF FINAL PAGE:      4019 pages active


note that I have addressed this issue with UVM, so UVM will report:
START:       676 pages active
AFTER MMAP 100 PAGES:       776 pages active
AFTER MUNMAP 99 of 100 PAGES:       677 pages active
AFTER MUNMAP OF FINAL PAGE:       676 pages active



this is basically a result of the Mach vm object design.   memory objects
(files, devices, and in this case anonymous memory) have no idea how
much of their range is mapped and how much is unmapped.   for files
this isn't a problem because we can just write unused pages back to the
file (or toss them) and be done with it.   but anonymous memory is backed
by swap, so the unused pages have to be written out to swap and that swap
space stays allocated.   if you keep doing this sort of thing, swap will
eventually fill up.


if this is indeed the problem, i'd like to use it as a practical example
for my writeup on the uvm stuff.

chuck