Source-Changes-HG archive


[src/uebayasi-xip]: src/sys/uvm uvmfault_promote: For promotion from a "lower...



details:   https://anonhg.NetBSD.org/src/rev/356799bfc118
branches:  uebayasi-xip
changeset: 751576:356799bfc118
user:      uebayasi <uebayasi%NetBSD.org@localhost>
date:      Fri Feb 12 16:06:50 2010 +0000

description:
uvmfault_promote: For promotion from a "lower" page, pass the owning struct
uvm_object * in from callers, because a device page's struct vm_page does not
carry a back-pointer to its uvm_object.

diffstat:

 sys/uvm/uvm_fault.c |  2284 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 2284 insertions(+), 0 deletions(-)

diffs (truncated from 2288 to 300 lines):

diff -r f4e55e886893 -r 356799bfc118 sys/uvm/uvm_fault.c
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/sys/uvm/uvm_fault.c       Fri Feb 12 16:06:50 2010 +0000
@@ -0,0 +1,2284 @@
+/*     $NetBSD: uvm_fault.c,v 1.166.2.2 2010/02/12 16:06:50 uebayasi Exp $     */
+
+/*
+ *
+ * Copyright (c) 1997 Charles D. Cranor and Washington University.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ * 3. All advertising materials mentioning features or use of this software
+ *    must display the following acknowledgement:
+ *      This product includes software developed by Charles D. Cranor and
+ *      Washington University.
+ * 4. The name of the author may not be used to endorse or promote products
+ *    derived from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
+ * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
+ * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
+ * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
+ * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
+ * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
+ * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * from: Id: uvm_fault.c,v 1.1.2.23 1998/02/06 05:29:05 chs Exp
+ */
+
+/*
+ * uvm_fault.c: fault handler
+ */
+
+#include <sys/cdefs.h>
+__KERNEL_RCSID(0, "$NetBSD: uvm_fault.c,v 1.166.2.2 2010/02/12 16:06:50 uebayasi Exp $");
+
+#include "opt_uvmhist.h"
+
+#include <sys/param.h>
+#include <sys/systm.h>
+#include <sys/kernel.h>
+#include <sys/proc.h>
+#include <sys/malloc.h>
+#include <sys/mman.h>
+
+#include <uvm/uvm.h>
+
+/*
+ *
+ * a word on page faults:
+ *
+ * types of page faults we handle:
+ *
+ * CASE 1: upper layer faults                   CASE 2: lower layer faults
+ *
+ *    CASE 1A         CASE 1B                  CASE 2A        CASE 2B
+ *    read/write1     write>1                  read/write   +-cow_write/zero
+ *         |             |                         |        |
+ *      +--|--+       +--|--+     +-----+       +  |  +     | +-----+
+ * amap |  V  |       |  ---------> new |          |        | |  ^  |
+ *      +-----+       +-----+     +-----+       +  |  +     | +--|--+
+ *                                                 |        |    |
+ *      +-----+       +-----+                   +--|--+     | +--|--+
+ * uobj | d/c |       | d/c |                   |  V  |     +----+  |
+ *      +-----+       +-----+                   +-----+       +-----+
+ *
+ * d/c = don't care
+ *
+ *   case [0]: layerless fault
+ *     no amap or uobj is present.   this is an error.
+ *
+ *   case [1]: upper layer fault [anon active]
+ *     1A: [read] or [write with anon->an_ref == 1]
+ *             I/O takes place in upper level anon and uobj is not touched.
+ *     1B: [write with anon->an_ref > 1]
+ *             new anon is alloc'd and data is copied off ["COW"]
+ *
+ *   case [2]: lower layer fault [uobj]
+ *     2A: [read on non-NULL uobj] or [write to non-copy_on_write area]
+ *             I/O takes place directly in object.
+ *     2B: [write to copy_on_write] or [read on NULL uobj]
+ *             data is "promoted" from uobj to a new anon.
+ *             if uobj is null, then we zero fill.
+ *
+ * we follow the standard UVM locking protocol ordering:
+ *
+ * MAPS => AMAP => UOBJ => ANON => PAGE QUEUES (PQ)
+ * we hold a PG_BUSY page if we unlock for I/O
+ *
+ *
+ * the code is structured as follows:
+ *
+ *     - init the "IN" params in the ufi structure
+ *   ReFault:
+ *     - do lookups [locks maps], check protection, handle needs_copy
+ *     - check for case 0 fault (error)
+ *     - establish "range" of fault
+ *     - if we have an amap lock it and extract the anons
+ *     - if sequential advice deactivate pages behind us
+ *     - at the same time check pmap for unmapped areas and anon for pages
+ *      that we could map in (and do map it if found)
+ *     - check object for resident pages that we could map in
+ *     - if (case 2) goto Case2
+ *     - >>> handle case 1
+ *           - ensure source anon is resident in RAM
+ *           - if case 1B alloc new anon and copy from source
+ *           - map the correct page in
+ *   Case2:
+ *     - >>> handle case 2
+ *           - ensure source page is resident (if uobj)
+ *           - if case 2B alloc new anon and copy from source (could be zero
+ *             fill if uobj == NULL)
+ *           - map the correct page in
+ *     - done!
+ *
+ * note on paging:
+ *   if we have to do I/O we place a PG_BUSY page in the correct object,
+ * unlock everything, and do the I/O.   when I/O is done we must reverify
+ * the state of the world before assuming that our data structures are
+ * valid.   [because mappings could change while the map is unlocked]
+ *
+ *  alternative 1: unbusy the page in question and restart the page fault
+ *    from the top (ReFault).   this is easy but does not take advantage
+ *    of the information that we already have from our previous lookup,
+ *    although it is possible that the "hints" in the vm_map will help here.
+ *
+ * alternative 2: the system already keeps track of a "version" number of
+ *    a map.   [i.e. every time you write-lock a map (e.g. to change a
+ *    mapping) you bump the version number up by one...]   so, we can save
+ *    the version number of the map before we release the lock and start I/O.
+ *    then when I/O is done we can relock and check the version numbers
+ *    to see if anything changed.    this might save us some over 1 because
+ *    we don't have to unbusy the page and may be less compares(?).
+ *
+ * alternative 3: put in backpointers or a way to "hold" part of a map
+ *    in place while I/O is in progress.   this could be complex to
+ *    implement (especially with structures like amap that can be referenced
+ *    by multiple map entries, and figuring out what should wait could be
+ *    complex as well...).
+ *
+ * we use alternative 2.  given that we are multi-threaded now we may want
+ * to reconsider the choice.
+ */
+
+/*
+ * local data structures
+ */
+
+struct uvm_advice {
+       int advice;
+       int nback;
+       int nforw;
+};
+
+/*
+ * page range array:
+ * note: index in array must match "advice" value
+ * XXX: borrowed numbers from freebsd.   do they work well for us?
+ */
+
+static const struct uvm_advice uvmadvice[] = {
+       { MADV_NORMAL, 3, 4 },
+       { MADV_RANDOM, 0, 0 },
+       { MADV_SEQUENTIAL, 8, 7},
+};
+
+#define UVM_MAXRANGE 16        /* must be MAX() of nback+nforw+1 */
+
+/*
+ * private prototypes
+ */
+
+/*
+ * inline functions
+ */
+
+/*
+ * uvmfault_anonflush: try and deactivate pages in specified anons
+ *
+ * => does not have to deactivate page if it is busy
+ */
+
+static inline void
+uvmfault_anonflush(struct vm_anon **anons, int n)
+{
+       int lcv;
+       struct vm_page *pg;
+
+       for (lcv = 0; lcv < n; lcv++) {
+               if (anons[lcv] == NULL)
+                       continue;
+               mutex_enter(&anons[lcv]->an_lock);
+               pg = anons[lcv]->an_page;
+               if (pg && (pg->flags & PG_BUSY) == 0) {
+                       mutex_enter(&uvm_pageqlock);
+                       if (pg->wire_count == 0) {
+                               uvm_pagedeactivate(pg);
+                       }
+                       mutex_exit(&uvm_pageqlock);
+               }
+               mutex_exit(&anons[lcv]->an_lock);
+       }
+}
+
+/*
+ * normal functions
+ */
+
+/*
+ * uvmfault_amapcopy: clear "needs_copy" in a map.
+ *
+ * => called with VM data structures unlocked (usually, see below)
+ * => we get a write lock on the maps and clear needs_copy for a VA
+ * => if we are out of RAM we sleep (waiting for more)
+ */
+
+static void
+uvmfault_amapcopy(struct uvm_faultinfo *ufi)
+{
+       for (;;) {
+
+               /*
+                * no mapping?  give up.
+                */
+
+               if (uvmfault_lookup(ufi, true) == false)
+                       return;
+
+               /*
+                * copy if needed.
+                */
+
+               if (UVM_ET_ISNEEDSCOPY(ufi->entry))
+                       amap_copy(ufi->map, ufi->entry, AMAP_COPY_NOWAIT,
+                               ufi->orig_rvaddr, ufi->orig_rvaddr + 1);
+
+               /*
+                * didn't work?  must be out of RAM.   unlock and sleep.
+                */
+
+               if (UVM_ET_ISNEEDSCOPY(ufi->entry)) {
+                       uvmfault_unlockmaps(ufi, true);
+                       uvm_wait("fltamapcopy");
+                       continue;
+               }
+
+               /*
+                * got it!   unlock and return.
+                */
+
+               uvmfault_unlockmaps(ufi, true);
+               return;
+       }
+       /*NOTREACHED*/
+}
+
+/*
+ * uvmfault_anonget: get data in an anon into a non-busy, non-released
+ * page in that anon.
+ *
+ * => maps, amap, and anon locked by caller.
+ * => if we fail (result != 0) we unlock everything.
+ * => if we are successful, we return with everything still locked.
+ * => we don't move the page on the queues [gets moved later]
+ * => if we allocate a new page [we_own], it gets put on the queues.
+ *    either way, the result is that the page is on the queues at return time
+ * => for pages which are on loan from a uvm_object (and thus are not
+ *    owned by the anon): if successful, we return with the owning object
+ *    locked.   the caller must unlock this object when it unlocks everything
+ *    else.
+ */
+
+int
+uvmfault_anonget(struct uvm_faultinfo *ufi, struct vm_amap *amap,
+    struct vm_anon *anon)
+{
+       bool we_own;    /* we own anon's page? */
+       bool locked;    /* did we relock? */
+       struct vm_page *pg;
+       int error;
+       UVMHIST_FUNC("uvmfault_anonget"); UVMHIST_CALLED(maphist);
+
+       KASSERT(mutex_owned(&anon->an_lock));
+
+       error = 0;
+       uvmexp.fltanget++;
+        /* bump rusage counters */
+       if (anon->an_page)
+               curlwp->l_ru.ru_minflt++;


