Subject: Re: third results (was: ddb & shared libs)
To: John S. Dyson <toor@dyson.iquest.net>
From: Chuck Cranor <chuck@maria.wustl.edu>
List: tech-kern
Date: 04/03/1998 10:47:48
hi john -

>FreeBSD does not need to use or create three pages like the old VM code
>did, and hasn't for the last year++.  The source says it all!!!  

yes, i would have been surprised if FreeBSD still had that sort of
behavior.   in fact, i'm surprised that it has only been gone for a
"year++"?   didn't you fix it farther back than that?

but as much as i like your work (and i do!), i'm afraid that although
the source says it all, it does not say enough to make it easy to
understand without a lot of study: the comments are minimal.


just for fun, i looked a bit at the freebsd code and the netbsd code.
the netbsd collapse code supports two types of collapses:

these are shadow object chains:

 a -> b -> c    --- you can collapse "a" into "b" if b's reference count is 1

  "a" collapse into "b"

      a -> c



    x
     \
 a -> b -> c    --- you can have "a" bypass "b" if all the pages in "b"
                    are already in "a" ... this allows "x" to be collapsed
                    into "b"  later like this:


 "a" bypass "b"     "x" collapse into "b"

    x
     \
      b -> c     ===>   x -> c
          /                 /
         a                 a
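
to make the two cases concrete, here is a rough sketch in C.  it is a
toy model, not code from either kernel: struct obj, its fields, NPAGES,
try_collapse and try_bypass are names made up for illustration, and it
ignores pagers, locking, and the special cases the real code handles.

```c
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 8   /* toy object size in pages (an assumption) */

/* toy stand-in for a vm_object; all names here are hypothetical */
struct obj {
    struct obj *shadow;            /* backing object, NULL at end of chain */
    int         refs;              /* reference count */
    bool        resident[NPAGES];  /* which pages this object holds */
};

/* "a" collapse into "b": allowed only if "a" is the sole reference to "b" */
static bool try_collapse(struct obj *a)
{
    struct obj *b = a->shadow;

    if (b == NULL || b->refs != 1)
        return false;
    for (size_t i = 0; i < NPAGES; i++) {
        if (b->resident[i] && !a->resident[i])
            a->resident[i] = true;   /* pull up pages "a" lacks */
        b->resident[i] = false;      /* duplicates are simply dropped */
    }
    a->shadow = b->shadow;   /* a -> c; b's reference to c passes to a */
    b->shadow = NULL;
    b->refs = 0;             /* "b" is now dead */
    return true;
}

/* "a" bypass "b": allowed only if every page in "b" is already in "a" */
static bool try_bypass(struct obj *a)
{
    struct obj *b = a->shadow;

    if (b == NULL)
        return false;
    for (size_t i = 0; i < NPAGES; i++)
        if (b->resident[i] && !a->resident[i])
            return false;    /* "a" still needs this page from "b" */
    a->shadow = b->shadow;   /* a -> c */
    if (b->shadow != NULL)
        b->shadow->refs++;   /* "a" now references "c" directly */
    b->refs--;               /* "a" no longer references "b" */
    return true;
}
```

in the x/a/b/c picture above, try_collapse(a) fails while "b" has two
references, try_bypass(a) succeeds once "a" holds all of b's pages, and
after that try_collapse(x) succeeds because "b" is down to one reference.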



now looking at the code, netbsd attempts either a bypass or a collapse
at the following times:
  [1] at a write fault (i assume this is mainly to catch the bypass
                        case since references shouldn't be changing 
                        that much?)
  [2] at vm_object_deallocate time (because you are changing reference
                        counts you should attempt a collapse)
  [3] vm_object_page_clean (not sure why)
  [4] vm_object_copy (at fork time)
  [5] vm_object_coalesce (at vm_map_insert -- mmap time)
  [6] at pageout time (try and collapse before making a pager)



now let's attempt to decipher the FreeBSD code:
 vm_object_collapse has a special case function vm_object_qcollapse
 [it isn't fully explained why] that does a plain object collapse
 that will "plug 99.9% of the rest of the leaks," whatever that means.

 in the non-vm_object_qcollapse case, the vm_object_collapse code goes
 on to check the reference count of the backing object and collapse
 if it is equal to one.    if the reference count is not equal to one
 then it attempts a bypass operation.
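
 as i read it, the decision order is roughly the sketch below.
 toy_object, choose_action and the pages_covered flag are hypothetical
 names, not FreeBSD identifiers, and the special cases that make a
 collapse fail are left out.

```c
#include <stdbool.h>
#include <stddef.h>

enum action { DO_NOTHING, DO_COLLAPSE, DO_BYPASS };

/* toy stand-in for a vm_object; names are hypothetical */
struct toy_object {
    struct toy_object *backing;    /* backing (shadowed) object */
    int                ref_count;  /* reference count */
};

/*
 * pages_covered: true if every page in the backing object is already
 * present in obj (the bypass condition); computed elsewhere.
 */
static enum action choose_action(const struct toy_object *obj,
                                 bool pages_covered)
{
    const struct toy_object *backing = obj->backing;

    if (backing == NULL)
        return DO_NOTHING;          /* nothing to collapse into */
    if (backing->ref_count == 1)
        return DO_COLLAPSE;         /* sole reference: collapse */
    return pages_covered ? DO_BYPASS : DO_NOTHING;
}
```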


in both the netbsd and freebsd code there are a number of "special cases"
that cause the collapse operation to fail.


let's see where freebsd does collapses:
  [1] vm_map_copy_entry (at fork time) -- this is basically the same
        case as netbsd case [4] (vm_object_copy).   
  [2] vm_object_deallocate (because you are changing reference
                        counts you should attempt a collapse)
  [3] vm_object_coalesce 
  [4] at pageout time


so the freebsd and netbsd code is actually very similar, except
that freebsd does not collapse at write fault time or at 
vm_object_page_clean time:   
  - looking at the FreeBSD CVS logs, I see that the vm_object_collapse 
    in vm_fault was removed quite recently (1998/01/12) in rev 1.74.    
    the commit log does not explain why (at least it wasn't obvious to
    me)
  - the call in vm_object_page_clean was removed a long time ago in 
    rev 1.2 of vm_object.c

>Collapse operations happen very efficiently on FreeBSD, 

because you have eliminated the vm_object_collapse calls in vm_fault
and vm_object_page_clean?


>and that has been preliminarily fixed for the last 4yrs, and fully 
>fixed for the last 3yrs.  

sorry, that doesn't parse.   if it was fully fixed for the last three
years then why do you say it was preliminarily fixed for the last four
years?   which is it?   also, rev 1.117 of vm_object.c (1998/03/08) claims
to have a fix for the collapse code, so it has most likely been
"fully fixed" for only the last month or so.  :-)



>Frankly, the use of the VM object inheritance scheme vs. other schemes 
>is a matter of opinion.  At the end of the day, the difference will be 
>negligible, because that isn't where the (performance or bug) problems 
>have been in FreeBSD for years.

hmmm.   well, in browsing the FreeBSD VM related CVS commit logs it
seems to me that a significant (and impressive!) amount of effort 
has gone into debugging and fixing object management code over the 
years.    and i've also seen both NetBSD and OpenBSD struggle with 
trying to sort out the object collapse issues properly.  it sure
seems painful to me.   if the difference in performance is going to
be negligible, then why suffer?




>There are superficial similarities between the MACH and the FreeBSD VM code,
>generalizing between the two is almost ludricrious nowadays.  

this is true only if you are considering the code from the
performance angle.   if you look at the data structures and functions
that FreeBSD VM provides, you will find that there is still quite
a bit of similarity between the FreeBSD VM and the 4.4BSD VM.
i would say the same is true of UVM and BSD VM.


>The ONLY valid test for the code is to run the code under real-world 
>conditions, or with a properly modeled benchmark.  Lmbench and light 
>loaded latency tests are mostly only good for such benchmark loads.

true.  but again, you are only considering the performance angle.
that isn't what Jim is looking at.   he is trying to understand 
how the VM works (i.e. what code paths are taken) and what resources 
are used along the way.   he isn't measuring latency or anything like
that - that's a different topic than the one being discussed.


chuck