Source-Changes-HG archive


[src/netbsd-8]: src/sys Pull up following revision(s) (requested by mrg in ti...



details:   https://anonhg.NetBSD.org/src/rev/e11183387988
branches:  netbsd-8
changeset: 851428:e11183387988
user:      martin <martin%NetBSD.org@localhost>
date:      Tue Feb 27 09:07:32 2018 +0000

description:
Pull up following revision(s) (requested by mrg in ticket #593):
        sys/dev/marvell/mvxpsec.c: revision 1.2
        sys/arch/m68k/m68k/pmap_motorola.c: revision 1.70
        sys/opencrypto/crypto.c: revision 1.102
        sys/arch/sparc64/sparc64/pmap.c: revision 1.308
        sys/ufs/chfs/chfs_malloc.c: revision 1.5
        sys/arch/powerpc/oea/pmap.c: revision 1.95
        sys/sys/pool.h: revision 1.80,1.82
        sys/kern/subr_pool.c: revision 1.209-1.216,1.219-1.220
        sys/arch/alpha/alpha/pmap.c: revision 1.262
        sys/kern/uipc_mbuf.c: revision 1.173
        sys/uvm/uvm_fault.c: revision 1.202
        sys/sys/mbuf.h: revision 1.172
        sys/kern/subr_extent.c: revision 1.86
        sys/arch/x86/x86/pmap.c: revision 1.266 (via patch)
        sys/dev/dtv/dtv_scatter.c: revision 1.4

Allow only one pending call to a pool's backing allocator at a time.
Candidate fix for problems with hanging after kva fragmentation related
to PR kern/45718.

Proposed on tech-kern:
https://mail-index.NetBSD.org/tech-kern/2017/10/23/msg022472.html
Tested by bouyer@ on i386.

This makes one small change to the semantics of pool_prime and
pool_setlowat: they may fail with EWOULDBLOCK instead of ENOMEM, if
there is a pending call to the backing allocator in another thread but
we are not actually out of memory.  That is unlikely because nearly
always these are used during initialization, when the pool is not in
use.
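
The error distinction described above can be sketched with a small userspace model. This is only an illustration under stated assumptions: `struct model_pool`, its fields, and `model_pool_grow_nowait` are hypothetical names, not the code in subr_pool.c, and the backing allocator is replaced by a boolean.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a pool; not the kernel's struct pool. */
struct model_pool {
	bool growing;		/* a backing-allocator call is pending */
	bool allocator_ok;	/* whether the (simulated) allocator succeeds */
	unsigned npages;	/* pages currently owned by the pool */
};

/*
 * Try to add one page without sleeping.  At most one caller may be
 * inside the backing allocator; a second caller gets EWOULDBLOCK,
 * which is distinct from the allocator itself reporting ENOMEM.
 */
static int
model_pool_grow_nowait(struct model_pool *pp)
{
	bool ok;

	if (pp->growing)
		return EWOULDBLOCK;	/* busy, but not out of memory */

	pp->growing = true;		/* claim the single pending-call slot */
	ok = pp->allocator_ok;		/* stands in for the allocator call */
	pp->growing = false;

	if (!ok)
		return ENOMEM;
	pp->npages++;
	return 0;
}
```

In this model a caller that sees EWOULDBLOCK can reasonably retry later, whereas ENOMEM means the allocator itself had nothing to give.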

Define the new flag too for previous commit.

pool_grow can now fail even when sleeping is ok. Catch this case in pool_get
and retry.

Assert that pool_get failure happens only with PR_NOWAIT.
This would have caught the mistake I made last week leading to null
pointer dereferences all over the place, a mistake which I evidently
poorly scheduled alongside maxv's change to the panic message on x86
for null pointer dereferences.

Since pr_lock is now used to wait for two things (PR_GROWING and
PR_WANTED) we need to loop for the condition we wanted.
Make the KASSERTMSG/panic strings consistent as '%s: [%s], __func__, wchan'
Handle the ERESTART case from pool_grow()
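
The retry behaviour described in the notes above (re-check the wanted condition after every wakeup, and try again when growing reports a restart) might look roughly like the following userspace sketch. All names here are assumptions made for illustration: `MODEL_ERESTART` is an arbitrary stand-in value, and `model_grow` and `model_get` are not the kernel functions.

```c
#include <assert.h>

#define MODEL_ERESTART	(-3)	/* hypothetical stand-in; value is arbitrary */

struct model_pool {
	int restarts_left;	/* how many times grow reports "retry" */
	int nitems;		/* free items available */
	int grow_calls;		/* bookkeeping for the example */
};

static int
model_grow(struct model_pool *pp)
{
	pp->grow_calls++;
	if (pp->restarts_left > 0) {
		pp->restarts_left--;
		return MODEL_ERESTART;	/* pool state may have changed */
	}
	pp->nitems++;
	return 0;
}

/* Returns 0 with one item consumed, or a nonzero error. */
static int
model_get(struct model_pool *pp)
{
	int error;

	for (;;) {
		if (pp->nitems > 0) {	/* re-check the condition each pass */
			pp->nitems--;
			return 0;
		}
		error = model_grow(pp);
		if (error == MODEL_ERESTART)
			continue;	/* start over from the top */
		if (error != 0)
			return error;
	}
}
```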

don't pass 0 to the pool flags
Guess pool_cache_get(pc, 0) means PR_WAITOK here.
Earlier on in the same context we use kmem_alloc(sz, KM_SLEEP).

use PR_WAITOK everywhere.
use PR_NOWAIT.

Don't use 0 for PR_NOWAIT

use PR_NOWAIT instead of 0

panic ex nihilo -- PR_NOWAITing for zerot

Add assertions that either PR_WAITOK or PR_NOWAIT is set.
- fix an assert; we can reach there if we are nowait or limitfail.
- when priming the pool and failing with ERESTART, don't decrement the number
  of pages; this avoids the issue of returning an ERESTART when we get to 0,
  and is more correct.
- simplify the pool_grow code, and don't wakeup things if we ENOMEM.
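
As an illustration of why passing 0 for the flags is rejected, a check of the following shape catches callers that name neither behaviour. The flag values and the helper name are assumptions made for this sketch; the real definitions live in sys/sys/pool.h and the real assertions in subr_pool.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical values for illustration only. */
#define MODEL_PR_WAITOK	0x01
#define MODEL_PR_NOWAIT	0x02

/* True when the caller stated whether it may sleep. */
static bool
model_flags_valid(int flags)
{
	return (flags & (MODEL_PR_WAITOK | MODEL_PR_NOWAIT)) != 0;
}
```

With a check like this, a call that passes 0 fails immediately instead of silently being treated as one mode or the other.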

In pmap_enter_ma(), only try to allocate pves if we might need them,
and even if that fails, only fail the operation if we later discover
that we really do need them.  This implements the requirement that
pmap_enter(PMAP_CANFAIL) must not fail when replacing an existing
mapping with the first mapping of a new page, which is an unintended
consequence of the changes from the rmind-uvmplock branch in 2011.

The problem arises when pmap_enter(PMAP_CANFAIL) is used to replace an existing
pmap mapping with a mapping of a different page (eg. to resolve a copy-on-write).
If that fails and leaves the old pmap entry in place, then UVM won't hold
the right locks when it eventually retries.  This entanglement of the UVM and
pmap locking was done in rmind-uvmplock in order to improve performance,
but it also means that the UVM state and pmap state need to be kept in sync
more than they did before.  It would be possible to handle this in the UVM code
instead of in the pmap code, but these pmap changes improve the handling of
low memory situations in general, and handling this in UVM would be clunky,
so this seemed like the better way to go.
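
The "allocate early, fail late" idea can be sketched as below. This is a simplified userspace model, not pmap_enter_ma(): the function name, its three boolean parameters, and `struct model_pv` are all hypothetical, and the decision about whether an entry is really needed is passed in rather than derived from page tables.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

struct model_pv { int unused; };

/*
 * might_need:  a pv entry could be required for this mapping.
 * really_need: decided later, once the existing mapping is examined.
 * alloc_fails: simulates a memory shortage.
 */
static int
model_enter(bool might_need, bool really_need, bool alloc_fails)
{
	struct model_pv *new_pve = NULL;
	int error = 0;

	/* Only try to allocate if an entry might be needed at all. */
	if (might_need && !alloc_fails)
		new_pve = malloc(sizeof(*new_pve));
	/* A failed allocation is not reported yet. */

	/* Only now, when the entry is really needed, is failure fatal. */
	if (really_need && new_pve == NULL)
		error = ENOMEM;

	free(new_pve);	/* the model never links the entry anywhere */
	return error;
}
```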

This somewhat indirectly fixes PR 52706, as well as the failing assertion
about "uvm_page_locked_p(old_pg)" (but only on x86; various other platforms
will need their own changes to handle this issue).

In uvm_fault_upper_enter(), if pmap_enter(PMAP_CANFAIL) fails, assert that
the pmap did not leave around a now-stale pmap mapping for an old page.
If such a pmap mapping still existed after we unlocked the vm_map,
the UVM code would not know later that it would need to lock the
lower layer object while calling the pmap to remove or replace that
stale pmap mapping.  See PR 52706 for further details.

Hopefully work around the irregular "fork fails in init" problem.
If a pool is growing, and the grower is PR_NOWAIT, mark this.
If another caller wants to grow the pool and is also PR_NOWAIT,
busy-wait for the original caller, which should either succeed
or hard-fail fairly quickly.

Implement the busy-wait by unlocking and relocking this pool's
mutex and returning ERESTART.  Other methods (such as having
the caller do this) were significantly more code and this hack
is fairly localised.
ok chs@ riastradh@
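
A rough model of the busy-wait described above follows. It is a single-threaded sketch under loud assumptions: `struct model_lock` merely records whether the lock is held and how often it was dropped, `MODEL_ERESTART` is an arbitrary stand-in value, and none of these names come from subr_pool.c.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_ERESTART	(-3)	/* hypothetical stand-in; value is arbitrary */

/* Toy lock that only tracks state; it provides no real exclusion. */
struct model_lock {
	bool held;
	int cycles;		/* number of times the lock was released */
};

static void
model_lock_enter(struct model_lock *l)
{
	assert(!l->held);
	l->held = true;
}

static void
model_lock_exit(struct model_lock *l)
{
	assert(l->held);
	l->held = false;
	l->cycles++;
}

struct model_pool {
	struct model_lock lock;
	bool growing;		/* someone is in the backing allocator */
	bool grower_nowait;	/* ... and that someone cannot sleep */
};

/* Called with pp->lock held; returns with it held. */
static int
model_grow_nowait(struct model_pool *pp)
{
	assert(pp->lock.held);
	if (pp->growing && pp->grower_nowait) {
		/* Drop and retake the lock so the other grower can finish. */
		model_lock_exit(&pp->lock);
		model_lock_enter(&pp->lock);
		return MODEL_ERESTART;	/* caller is expected to retry */
	}
	pp->growing = true;
	pp->grower_nowait = true;
	/* ... the backing allocator would be called here ... */
	pp->growing = false;
	pp->grower_nowait = false;
	return 0;
}
```

A caller pairs this with a retry loop, so the busy-wait is the repeated unlock, relock, and restart rather than an explicit sleep.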

Don't release the lock in the PR_NOWAIT allocation. Move flags setting
after acquiring the mutex. (from Tobias Nygren)

Apply the change from arch/x86/x86/pmap.c rev. 1.266 commitid vZRjvmxG7YTHLOfA:

In pmap_enter_ma(), only try to allocate pves if we might need them,
and even if that fails, only fail the operation if we later discover
that we really do need them.  If we are replacing an existing mapping,
reuse the pv structure where possible.

This implements the requirement that pmap_enter(PMAP_CANFAIL) must not fail
when replacing an existing mapping with the first mapping of a new page,
which is an unintended consequence of the changes from the rmind-uvmplock
branch in 2011.

The problem arises when pmap_enter(PMAP_CANFAIL) is used to replace an existing
pmap mapping with a mapping of a different page (eg. to resolve a copy-on-write).
If that fails and leaves the old pmap entry in place, then UVM won't hold
the right locks when it eventually retries.  This entanglement of the UVM and
pmap locking was done in rmind-uvmplock in order to improve performance,
but it also means that the UVM state and pmap state need to be kept in sync
more than they did before.  It would be possible to handle this in the UVM code
instead of in the pmap code, but these pmap changes improve the handling of
low memory situations in general, and handling this in UVM would be clunky,
so this seemed like the better way to go.

This somewhat indirectly fixes PR 52706 on the remaining platforms where
this problem existed.
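
The pv reuse pattern can be illustrated with a small userspace model loosely patterned on the alpha hunks in the diff below, where the remove path hands the old entry back through an extra pointer argument and the enter path accepts a preallocated entry. The names, the single-slot "list", and the allocation counter are assumptions made for this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct model_pv { int va; };

static int model_allocs;	/* counts calls to the allocator */

static struct model_pv *
model_pv_alloc(void)
{
	model_allocs++;
	return malloc(sizeof(struct model_pv));
}

/* Unlink *slot; hand it back through opvp if given, else free it. */
static void
model_pv_remove(struct model_pv **slot, struct model_pv **opvp)
{
	struct model_pv *pv = *slot;

	*slot = NULL;
	if (opvp != NULL)
		*opvp = pv;
	else
		free(pv);
}

/* Install an entry for va, reusing newpv when one is supplied. */
static int
model_pv_enter(struct model_pv **slot, int va, struct model_pv *newpv)
{
	if (newpv == NULL) {
		newpv = model_pv_alloc();
		if (newpv == NULL)
			return ENOMEM;
	}
	newpv->va = va;
	*slot = newpv;
	return 0;
}
```

Because the replaced mapping's entry is recycled, replacing a mapping in this model needs no new allocation and so cannot fail for lack of memory.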

diffstat:

 sys/arch/alpha/alpha/pmap.c        |   54 +++++++----
 sys/arch/m68k/m68k/pmap_motorola.c |   46 +++++++---
 sys/arch/powerpc/oea/pmap.c        |   18 ++-
 sys/arch/sparc64/sparc64/pmap.c    |   84 ++++++++-----------
 sys/arch/x86/x86/pmap.c            |   77 +++++++++++++----
 sys/dev/dtv/dtv_scatter.c          |    6 +-
 sys/dev/marvell/mvxpsec.c          |    6 +-
 sys/kern/subr_extent.c             |    6 +-
 sys/kern/subr_pool.c               |  161 +++++++++++++++++++++++++-----------
 sys/kern/uipc_mbuf.c               |    6 +-
 sys/opencrypto/crypto.c            |    8 +-
 sys/sys/mbuf.h                     |    4 +-
 sys/sys/pool.h                     |    4 +-
 sys/ufs/chfs/chfs_malloc.c         |   14 +-
 sys/uvm/uvm_fault.c                |   30 +++++-
 15 files changed, 332 insertions(+), 192 deletions(-)

diffs (truncated from 1404 to 300 lines):

diff -r 7ca1458ec1d3 -r e11183387988 sys/arch/alpha/alpha/pmap.c
--- a/sys/arch/alpha/alpha/pmap.c       Tue Feb 27 06:07:28 2018 +0000
+++ b/sys/arch/alpha/alpha/pmap.c       Tue Feb 27 09:07:32 2018 +0000
@@ -1,4 +1,4 @@
-/* $NetBSD: pmap.c,v 1.261 2016/12/23 07:15:27 cherry Exp $ */
+/* $NetBSD: pmap.c,v 1.261.8.1 2018/02/27 09:07:33 martin Exp $ */
 
 /*-
  * Copyright (c) 1998, 1999, 2000, 2001, 2007, 2008 The NetBSD Foundation, Inc.
@@ -140,7 +140,7 @@
 
 #include <sys/cdefs.h>                 /* RCS ID & Copyright macro defns */
 
-__KERNEL_RCSID(0, "$NetBSD: pmap.c,v 1.261 2016/12/23 07:15:27 cherry Exp $");
+__KERNEL_RCSID(0, "$NetBSD: pmap.c,v 1.261.8.1 2018/02/27 09:07:33 martin Exp $");
 
 #include <sys/param.h>
 #include <sys/systm.h>
@@ -439,7 +439,8 @@
  * Internal routines
  */
 static void    alpha_protection_init(void);
-static bool    pmap_remove_mapping(pmap_t, vaddr_t, pt_entry_t *, bool, long);
+static bool    pmap_remove_mapping(pmap_t, vaddr_t, pt_entry_t *, bool, long,
+                                   pv_entry_t *);
 static void    pmap_changebit(struct vm_page *, pt_entry_t, pt_entry_t, long);
 
 /*
@@ -466,8 +467,9 @@
  * PV table management functions.
  */
 static int     pmap_pv_enter(pmap_t, struct vm_page *, vaddr_t, pt_entry_t *,
-                             bool);
-static void    pmap_pv_remove(pmap_t, struct vm_page *, vaddr_t, bool);
+                             bool, pv_entry_t);
+static void    pmap_pv_remove(pmap_t, struct vm_page *, vaddr_t, bool,
+                              pv_entry_t *);
 static void    *pmap_pv_page_alloc(struct pool *, int);
 static void    pmap_pv_page_free(struct pool *, void *);
 
@@ -1266,7 +1268,7 @@
                                            sva);
 #endif
                                needisync |= pmap_remove_mapping(pmap, sva,
-                                   l3pte, true, cpu_id);
+                                   l3pte, true, cpu_id, NULL);
                        }
                        sva += PAGE_SIZE;
                }
@@ -1343,7 +1345,7 @@
                                                    pmap_remove_mapping(
                                                        pmap, sva,
                                                        l3pte, true,
-                                                       cpu_id);
+                                                       cpu_id, NULL);
                                        }
 
                                        /*
@@ -1450,7 +1452,7 @@
                        panic("pmap_page_protect: bad mapping");
 #endif
                if (pmap_remove_mapping(pmap, pv->pv_va, pv->pv_pte,
-                   false, cpu_id) == true) {
+                   false, cpu_id, NULL)) {
                        if (pmap == pmap_kernel())
                                needkisync |= true;
                        else
@@ -1558,6 +1560,7 @@
 {
        struct vm_page *pg;                     /* if != NULL, managed page */
        pt_entry_t *pte, npte, opte;
+       pv_entry_t opv = NULL;
        paddr_t opa;
        bool tflush = true;
        bool hadasm = false;    /* XXX gcc -Wuninitialized */
@@ -1750,14 +1753,15 @@
                 */
                pmap_physpage_addref(pte);
        }
-       needisync |= pmap_remove_mapping(pmap, va, pte, true, cpu_id);
+       needisync |= pmap_remove_mapping(pmap, va, pte, true, cpu_id, &opv);
 
  validate_enterpv:
        /*
         * Enter the mapping into the pv_table if appropriate.
         */
        if (pg != NULL) {
-               error = pmap_pv_enter(pmap, pg, va, pte, true);
+               error = pmap_pv_enter(pmap, pg, va, pte, true, opv);
+               opv = NULL;
                if (error) {
                        pmap_l3pt_delref(pmap, va, pte, cpu_id);
                        if (flags & PMAP_CANFAIL)
@@ -1845,6 +1849,8 @@
 out:
        PMAP_UNLOCK(pmap);
        PMAP_MAP_TO_HEAD_UNLOCK();
+       if (opv)
+               pmap_pv_free(opv);
        
        return error;
 }
@@ -2422,7 +2428,7 @@
  */
 static bool
 pmap_remove_mapping(pmap_t pmap, vaddr_t va, pt_entry_t *pte,
-    bool dolock, long cpu_id)
+    bool dolock, long cpu_id, pv_entry_t *opvp)
 {
        paddr_t pa;
        struct vm_page *pg;             /* if != NULL, page is managed */
@@ -2434,8 +2440,8 @@
 
 #ifdef DEBUG
        if (pmapdebug & (PDB_FOLLOW|PDB_REMOVE|PDB_PROTECT))
-               printf("pmap_remove_mapping(%p, %lx, %p, %d, %ld)\n",
-                      pmap, va, pte, dolock, cpu_id);
+               printf("pmap_remove_mapping(%p, %lx, %p, %d, %ld, %p)\n",
+                      pmap, va, pte, dolock, cpu_id, opvp);
 #endif
 
        /*
@@ -2511,7 +2517,8 @@
         */
        pg = PHYS_TO_VM_PAGE(pa);
        KASSERT(pg != NULL);
-       pmap_pv_remove(pmap, pg, va, dolock);
+       pmap_pv_remove(pmap, pg, va, dolock, opvp);
+       KASSERT(opvp == NULL || *opvp != NULL);
 
        return (needisync);
 }
@@ -2765,18 +2772,19 @@
  */
 static int
 pmap_pv_enter(pmap_t pmap, struct vm_page *pg, vaddr_t va, pt_entry_t *pte,
-    bool dolock)
+    bool dolock, pv_entry_t newpv)
 {
        struct vm_page_md * const md = VM_PAGE_TO_MD(pg);
-       pv_entry_t newpv;
        kmutex_t *lock;
 
        /*
         * Allocate and fill in the new pv_entry.
         */
-       newpv = pmap_pv_alloc();
-       if (newpv == NULL)
-               return ENOMEM;
+       if (newpv == NULL) {
+               newpv = pmap_pv_alloc();
+               if (newpv == NULL)
+                       return ENOMEM;
+       }
        newpv->pv_va = va;
        newpv->pv_pmap = pmap;
        newpv->pv_pte = pte;
@@ -2820,7 +2828,8 @@
  *     Remove a physical->virtual entry from the pv_table.
  */
 static void
-pmap_pv_remove(pmap_t pmap, struct vm_page *pg, vaddr_t va, bool dolock)
+pmap_pv_remove(pmap_t pmap, struct vm_page *pg, vaddr_t va, bool dolock,
+       pv_entry_t *opvp)
 {
        struct vm_page_md * const md = VM_PAGE_TO_MD(pg);
        pv_entry_t pv, *pvp;
@@ -2852,7 +2861,10 @@
                mutex_exit(lock);
        }
 
-       pmap_pv_free(pv);
+       if (opvp != NULL)
+               *opvp = pv;
+       else
+               pmap_pv_free(pv);
 }
 
 /*
diff -r 7ca1458ec1d3 -r e11183387988 sys/arch/m68k/m68k/pmap_motorola.c
--- a/sys/arch/m68k/m68k/pmap_motorola.c        Tue Feb 27 06:07:28 2018 +0000
+++ b/sys/arch/m68k/m68k/pmap_motorola.c        Tue Feb 27 09:07:32 2018 +0000
@@ -1,4 +1,4 @@
-/*     $NetBSD: pmap_motorola.c,v 1.69 2016/12/23 07:15:27 cherry Exp $        */
+/*     $NetBSD: pmap_motorola.c,v 1.69.8.1 2018/02/27 09:07:32 martin Exp $        */
 
 /*-
  * Copyright (c) 1999 The NetBSD Foundation, Inc.
@@ -119,7 +119,7 @@
 #include "opt_m68k_arch.h"
 
 #include <sys/cdefs.h>
-__KERNEL_RCSID(0, "$NetBSD: pmap_motorola.c,v 1.69 2016/12/23 07:15:27 cherry Exp $");
+__KERNEL_RCSID(0, "$NetBSD: pmap_motorola.c,v 1.69.8.1 2018/02/27 09:07:32 martin Exp $");
 
 #include <sys/param.h>
 #include <sys/systm.h>
@@ -306,7 +306,8 @@
 /*
  * Internal routines
  */
-void   pmap_remove_mapping(pmap_t, vaddr_t, pt_entry_t *, int);
+void   pmap_remove_mapping(pmap_t, vaddr_t, pt_entry_t *, int,
+                           struct pv_entry **);
 bool   pmap_testbit(paddr_t, int);
 bool   pmap_changebit(paddr_t, int, int);
 int    pmap_enter_ptpage(pmap_t, vaddr_t, bool);
@@ -843,7 +844,7 @@
                                }
                                firstpage = false;
 #endif
-                               pmap_remove_mapping(pmap, sva, pte, flags);
+                               pmap_remove_mapping(pmap, sva, pte, flags, NULL);
                        }
                        pte++;
                        sva += PAGE_SIZE;
@@ -929,7 +930,7 @@
                        panic("pmap_page_protect: bad mapping");
 #endif
                pmap_remove_mapping(pv->pv_pmap, pv->pv_va,
-                   pte, PRM_TFLUSH|PRM_CFLUSH);
+                   pte, PRM_TFLUSH|PRM_CFLUSH, NULL);
        }
        splx(s);
 }
@@ -1048,6 +1049,7 @@
 pmap_enter(pmap_t pmap, vaddr_t va, paddr_t pa, vm_prot_t prot, u_int flags)
 {
        pt_entry_t *pte;
+       struct pv_entry *opv = NULL;
        int npte;
        paddr_t opa;
        bool cacheable = true;
@@ -1130,7 +1132,7 @@
                PMAP_DPRINTF(PDB_ENTER,
                    ("enter: removing old mapping %lx\n", va));
                pmap_remove_mapping(pmap, va, pte,
-                   PRM_TFLUSH|PRM_CFLUSH|PRM_KEEPPTPAGE);
+                   PRM_TFLUSH|PRM_CFLUSH|PRM_KEEPPTPAGE, &opv);
        }
 
        /*
@@ -1179,7 +1181,12 @@
                                if (pmap == npv->pv_pmap && va == npv->pv_va)
                                        panic("pmap_enter: already in pv_tab");
 #endif
-                       npv = pmap_alloc_pv();
+                       if (opv != NULL) {
+                               npv = opv;
+                               opv = NULL;
+                       } else {
+                               npv = pmap_alloc_pv();
+                       }
                        KASSERT(npv != NULL);
                        npv->pv_va = va;
                        npv->pv_pmap = pmap;
@@ -1346,6 +1353,9 @@
                pmap_check_wiring("enter", trunc_page((vaddr_t)pte));
 #endif
 
+       if (opv != NULL)
+               pmap_free_pv(opv);
+
        return 0;
 }
 
@@ -1659,7 +1669,7 @@
 
                (void) pmap_extract(pmap, pv->pv_va, &kpa);
                pmap_remove_mapping(pmap, pv->pv_va, NULL,
-                   PRM_TFLUSH|PRM_CFLUSH);
+                   PRM_TFLUSH|PRM_CFLUSH, NULL);
 
                /*
                 * Use the physical address to locate the original
@@ -1970,11 +1980,12 @@
  */
 /* static */
 void
-pmap_remove_mapping(pmap_t pmap, vaddr_t va, pt_entry_t *pte, int flags)
+pmap_remove_mapping(pmap_t pmap, vaddr_t va, pt_entry_t *pte, int flags,
+    struct pv_entry **opvp)
 {
        paddr_t pa;
        struct pv_header *pvh;
-       struct pv_entry *pv, *npv;
+       struct pv_entry *pv, *npv, *opv = NULL;
        struct pmap *ptpmap;
        st_entry_t *ste;
        int s, bits;
@@ -1983,8 +1994,8 @@
 #endif
 
        PMAP_DPRINTF(PDB_FOLLOW|PDB_REMOVE|PDB_PROTECT,
-           ("pmap_remove_mapping(%p, %lx, %p, %x)\n",
-           pmap, va, pte, flags));
+           ("pmap_remove_mapping(%p, %lx, %p, %x, %p)\n",
+           pmap, va, pte, flags, opvp));
 
        /*


