tech-kern archive


Re: Fixing pool_cache_invalidate(9), final step



On Thu, 31 May 2012 12:55:42 +0100, Jean-Yves Migeon wrote:
If my grepping is right, pool_reclaim/pool_cache_reclaim are only used
in subr_pool.c (for _HARDKERNEL), so draining should indeed only happen
from the pageout daemon and not from interrupt context at all. So we
should (if I'm not mistaken) be able to simplify by integrating
pool_drain_start/end and calling pool_cache_xcall from
pool_cache_invalidate...

Currently, with the split pool_drain_start/end, the time the xcall runs
might on SMP be used to drain the buffer pool; that is the only
difference that comes to my mind quickly.

I came to the same conclusion while looking at the drain_start/_end
functions, but they were outside the scope of my patch (I expected
opposition, so kept it as small as possible).

I think this comes as a package ;-)

Just some thoughts,

Will do; however, I cannot make such an API change for -6, even if it
does not appear in the documentation (I can still ask releng@ and
core@, but I highly doubt that this is worth the hassle). I can do
this for -current though.

I will update both patches accordingly, but the drain start/end split
will likely remain for -6.

Here are the new proposals, taking Lars' suggestions into account.

[General]
- Dumped pool_cache_invalidate_cpu() in favor of pool_cache_xcall(), which transfers CPU-bound objects back to the global pool.
- Adapted comments accordingly.

[-Current]
- pool_cache_invalidate() will now schedule an xcall(9) so that all CPUs feed their objects back to the global pool before invalidation. The function remains synchronous, because we have to wait for all CPUs to finish executing the xcall before invalidation can start.
- the pool_drain_start/_end functions have been merged, now that pool_cache_invalidate() drains per-CPU caches synchronously. The new pool_drain() function is

"bool pool_drain(struct pool **ppp)", with:
- bool indicating whether reclaiming was fully done (true) or not (false),
- ppp will contain a pointer to the pool that was drained (optional argument; see the usage sketch below).
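
To illustrate the calling convention, here is a minimal sketch of a drain loop under the merged interface (a hypothetical caller, modeled on the uvm_pageout()/rump changes in the patch below; the actual reclaim work between iterations is elided):

        struct pool *pp, *pp_first;
        bool succ;

        /* Each call drains one pool and advances to the next. */
        succ = pool_drain(&pp_first);
        while (!succ) {
                succ = pool_drain(&pp);
                /* Stop once we have cycled through every pool. */
                if (pp == pp_first)
                        break;
        }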

[NetBSD-6]
- pool_cache_invalidate() will only feed back objects cached by the calling CPU. This will not invalidate _all_ CPU-bound objects (the caller has to explicitly schedule an xcall(9) for the !curcpu() CPUs), but at least there is now a way to invalidate them (see the sketch after this list).
- reflect this in the CAVEATS section of the pool_cache(9) documentation.
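
As an illustration of the -6 semantics, a minimal sketch of how a caller could invalidate the per-CPU caches on every CPU; the wrapper pool_cache_invalidate_all() and the trampoline invalidate_xc() are hypothetical names, only pool_cache_invalidate() and the xcall(9) interfaces are real:

        #include <sys/pool.h>
        #include <sys/xcall.h>

        static void
        invalidate_xc(void *arg1, void *arg2 /* unused */)
        {

                pool_cache_invalidate((pool_cache_t)arg1);
        }

        static void
        pool_cache_invalidate_all(pool_cache_t pc)
        {
                uint64_t where;

                /* Run pool_cache_invalidate() on every CPU, then wait. */
                where = xc_broadcast(0, invalidate_xc, pc, NULL);
                xc_wait(where);
        }

Note that this must run in thread context, since xc_wait() may block.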

[NetBSD-5]
In case someone thinks it would be reasonable to backport the -6 fix to -5 (or at least document the CAVEAT in pool_cache(9)), ping me.


Full release builds for -nb6 and -current have been done over the weekend, plus two ATF runs for each (one with 1GiB of memory, the other with "just" 64MiB to put a bit of pressure on the pgdaemon). No regressions observed.

I am planning to commit these at the end of the week, and submit the NetBSD-6 patch to releng today.

Comments :) ?

--
Jean-Yves Migeon
jeanyves.migeon%free.fr@localhost
Index: external/cddl/osnet/sys/kern/misc.c
===================================================================
RCS file: /cvsroot/src/external/cddl/osnet/sys/kern/misc.c,v
retrieving revision 1.3
diff -u -p -r1.3 misc.c
--- external/cddl/osnet/sys/kern/misc.c 10 Mar 2011 19:35:24 -0000      1.3
+++ external/cddl/osnet/sys/kern/misc.c 4 Jun 2012 09:00:31 -0000
@@ -133,15 +133,8 @@ void
 kmem_reap(void)
 {
        int bufcnt;
-       uint64_t where;
        struct pool *pp;
        
-       /*
-        * start draining pool resources now that we're not
-        * holding any locks.
-        */
-       pool_drain_start(&pp, &where);
-
        bufcnt = uvmexp.freetarg - uvmexp.free;
        if (bufcnt < 0)
                bufcnt = 0;
@@ -153,9 +146,9 @@ kmem_reap(void)
        mutex_exit(&bufcache_lock);
 
        /*
-        * complete draining the pools.
+        * drain the pools.
         */
-       pool_drain_end(pp, where);
+       pool_drain(&pp);
 //     printf("XXXNETBSD kmem_reap called, write me\n");
 }
 
Index: sys/kern/subr_pool.c
===================================================================
RCS file: /cvsroot/src/sys/kern/subr_pool.c,v
retrieving revision 1.195
diff -u -p -r1.195 subr_pool.c
--- sys/kern/subr_pool.c        5 May 2012 19:15:10 -0000       1.195
+++ sys/kern/subr_pool.c        4 Jun 2012 09:01:44 -0000
@@ -1300,7 +1300,7 @@ pool_sethardlimit(struct pool *pp, int n
 /*
  * Release all complete pages that have not been used recently.
  *
- * Might be called from interrupt context.
+ * Must not be called from interrupt context.
  */
 int
 pool_reclaim(struct pool *pp)
@@ -1311,9 +1311,7 @@ pool_reclaim(struct pool *pp)
        bool klock;
        int rv;
 
-       if (cpu_intr_p() || cpu_softintr_p()) {
-               KASSERT(pp->pr_ipl != IPL_NONE);
-       }
+       KASSERT(!cpu_intr_p() && !cpu_softintr_p());
 
        if (pp->pr_drain_hook != NULL) {
                /*
@@ -1387,17 +1385,14 @@ pool_reclaim(struct pool *pp)
 }
 
 /*
- * Drain pools, one at a time.  This is a two stage process;
- * drain_start kicks off a cross call to drain CPU-level caches
- * if the pool has an associated pool_cache.  drain_end waits
- * for those cross calls to finish, and then drains the cache
- * (if any) and pool.
+ * Drain pools, one at a time. The drained pool is returned within ppp.
  *
  * Note, must never be called from interrupt context.
  */
-void
-pool_drain_start(struct pool **ppp, uint64_t *wp)
+bool
+pool_drain(struct pool **ppp)
 {
+       bool reclaimed;
        struct pool *pp;
 
        KASSERT(!TAILQ_EMPTY(&pool_head));
@@ -1422,28 +1417,6 @@ pool_drain_start(struct pool **ppp, uint
        pp->pr_refcnt++;
        mutex_exit(&pool_head_lock);
 
-       /* If there is a pool_cache, drain CPU level caches. */
-       *ppp = pp;
-       if (pp->pr_cache != NULL) {
-               *wp = xc_broadcast(0, (xcfunc_t)pool_cache_xcall,
-                   pp->pr_cache, NULL);
-       }
-}
-
-bool
-pool_drain_end(struct pool *pp, uint64_t where)
-{
-       bool reclaimed;
-
-       if (pp == NULL)
-               return false;
-
-       KASSERT(pp->pr_refcnt > 0);
-
-       /* Wait for remote draining to complete. */
-       if (pp->pr_cache != NULL)
-               xc_wait(where);
-
        /* Drain the cache (if any) and pool.. */
        reclaimed = pool_reclaim(pp);
 
@@ -1453,6 +1426,9 @@ pool_drain_end(struct pool *pp, uint64_t
        cv_broadcast(&pool_busy);
        mutex_exit(&pool_head_lock);
 
+       if (ppp != NULL)
+               *ppp = pp;
+
        return reclaimed;
 }
 
@@ -2012,26 +1988,28 @@ void
 pool_cache_invalidate(pool_cache_t pc)
 {
        pcg_t *full, *empty, *part;
-#if 0
        uint64_t where;
 
        if (ncpu < 2 || !mp_online) {
                /*
                 * We might be called early enough in the boot process
                 * for the CPU data structures to not be fully initialized.
-                * In this case, simply gather the local CPU's cache now
-                * since it will be the only one running.
+                * In this case, transfer the content of the local CPU's
+                * cache back into global cache as only this CPU is currently
+                * running.
                 */
                pool_cache_xcall(pc);
        } else {
                /*
-                * Gather all of the CPU-specific caches into the
-                * global cache.
+                * Signal all CPUs that they must transfer their local
+                * cache back to the global pool then wait for the xcall to
+                * complete.
                 */
                where = xc_broadcast(0, (xcfunc_t)pool_cache_xcall, pc, NULL);
                xc_wait(where);
        }
-#endif
+
+       /* Empty pool caches, then invalidate objects */
        mutex_enter(&pc->pc_lock);
        full = pc->pc_fullgroups;
        empty = pc->pc_emptygroups;
Index: sys/rump/librump/rumpkern/memalloc.c
===================================================================
RCS file: /cvsroot/src/sys/rump/librump/rumpkern/memalloc.c,v
retrieving revision 1.15
diff -u -p -r1.15 memalloc.c
--- sys/rump/librump/rumpkern/memalloc.c        29 Apr 2012 20:27:32 -0000      1.15
+++ sys/rump/librump/rumpkern/memalloc.c        4 Jun 2012 09:01:45 -0000
@@ -285,15 +285,8 @@ pool_cache_set_drain_hook(pool_cache_t p
        pc->pc_pool.pr_drain_hook_arg = arg;
 }
 
-void
-pool_drain_start(struct pool **ppp, uint64_t *wp)
-{
-
-       /* nada */
-}
-
 bool
-pool_drain_end(struct pool *pp, uint64_t w)
+pool_drain(struct pool **ppp)
 {
 
        /* can't reclaim anything in this model */
Index: sys/rump/librump/rumpkern/vm.c
===================================================================
RCS file: /cvsroot/src/sys/rump/librump/rumpkern/vm.c,v
retrieving revision 1.126
diff -u -p -r1.126 vm.c
--- sys/rump/librump/rumpkern/vm.c      23 May 2012 14:59:21 -0000      1.126
+++ sys/rump/librump/rumpkern/vm.c      4 Jun 2012 09:01:45 -0000
@@ -989,7 +989,6 @@ uvm_pageout(void *arg)
 {
        struct vm_page *pg;
        struct pool *pp, *pp_first;
-       uint64_t where;
        int cleaned, skip, skipped;
        int waspaging;
        bool succ;
@@ -1094,19 +1093,15 @@ uvm_pageout(void *arg)
                /*
                 * And then drain the pools.  Wipe them out ... all of them.
                 */
-
-               pool_drain_start(&pp_first, &where);
-               pp = pp_first;
+               succ = pool_drain(&pp_first);
                for (;;) {
                        rump_vfs_drainbufs(10 /* XXX: estimate better */);
-                       succ = pool_drain_end(pp, where);
                        if (succ)
                                break;
-                       pool_drain_start(&pp, &where);
-                       if (pp == pp_first) {
-                               succ = pool_drain_end(pp, where);
+
+                       succ = pool_drain(&pp);
+                       if (pp == pp_first)
                                break;
-                       }
                }
 
                /*
Index: sys/sys/pool.h
===================================================================
RCS file: /cvsroot/src/sys/sys/pool.h,v
retrieving revision 1.74
diff -u -p -r1.74 pool.h
--- sys/sys/pool.h      5 May 2012 19:15:10 -0000       1.74
+++ sys/sys/pool.h      4 Jun 2012 09:01:45 -0000
@@ -263,8 +263,7 @@ int         pool_prime(struct pool *, int);
 void           pool_setlowat(struct pool *, int);
 void           pool_sethiwat(struct pool *, int);
 void           pool_sethardlimit(struct pool *, int, const char *, int);
-void           pool_drain_start(struct pool **, uint64_t *);
-bool           pool_drain_end(struct pool *, uint64_t);
+bool           pool_drain(struct pool **);
 
 /*
  * Debugging and diagnostic aides.
Index: sys/uvm/uvm_pdaemon.c
===================================================================
RCS file: /cvsroot/src/sys/uvm/uvm_pdaemon.c,v
retrieving revision 1.105
diff -u -p -r1.105 uvm_pdaemon.c
--- sys/uvm/uvm_pdaemon.c       1 Feb 2012 23:43:49 -0000       1.105
+++ sys/uvm/uvm_pdaemon.c       4 Jun 2012 09:01:46 -0000
@@ -228,7 +228,6 @@ uvm_pageout(void *arg)
        int bufcnt, npages = 0;
        int extrapages = 0;
        struct pool *pp;
-       uint64_t where;
        
        UVMHIST_FUNC("uvm_pageout"); UVMHIST_CALLED(pdhist);
 
@@ -328,12 +327,6 @@ uvm_pageout(void *arg)
                        continue;
 
                /*
-                * start draining pool resources now that we're not
-                * holding any locks.
-                */
-               pool_drain_start(&pp, &where);
-
-               /*
                 * kill unused metadata buffers.
                 */
                mutex_enter(&bufcache_lock);
@@ -341,9 +334,9 @@ uvm_pageout(void *arg)
                mutex_exit(&bufcache_lock);
 
                /*
-                * complete draining the pools.
+                * drain the pools.
                 */
-               pool_drain_end(pp, where);
+               pool_drain(&pp);
        }
        /*NOTREACHED*/
 }
Index: share/man/man9/pool_cache.9
===================================================================
RCS file: /cvsroot/src/share/man/man9/pool_cache.9,v
retrieving revision 1.19
diff -u -p -r1.19 pool_cache.9
--- share/man/man9/pool_cache.9 15 Nov 2011 00:32:34 -0000      1.19
+++ share/man/man9/pool_cache.9 4 Jun 2012 08:38:39 -0000
@@ -360,3 +360,13 @@ subsystem is implemented within the file
 .Xr memoryallocators 9 ,
 .Xr percpu 9 ,
 .Xr pool 9
+.\" ------------------------------------------------------------
+.Sh CAVEATS
+.Fn pool_cache_invalidate
+only affects objects safely accessible by the local CPU.
+On multiprocessor systems, this function should be called by each CPU to
+invalidate its local cache.
+See
+.Xr xcall 9
+for an interface to schedule the execution of arbitrary functions
+on any other CPU.
Index: sys/kern/subr_pool.c
===================================================================
RCS file: /cvsroot/src/sys/kern/subr_pool.c,v
retrieving revision 1.194
diff -u -p -r1.194 subr_pool.c
--- sys/kern/subr_pool.c        4 Feb 2012 22:11:42 -0000       1.194
+++ sys/kern/subr_pool.c        4 Jun 2012 08:39:33 -0000
@@ -2245,26 +2245,15 @@ void
 pool_cache_invalidate(pool_cache_t pc)
 {
        pcg_t *full, *empty, *part;
-#if 0
-       uint64_t where;
 
-       if (ncpu < 2 || !mp_online) {
-               /*
-                * We might be called early enough in the boot process
-                * for the CPU data structures to not be fully initialized.
-                * In this case, simply gather the local CPU's cache now
-                * since it will be the only one running.
-                */
-               pool_cache_xcall(pc);
-       } else {
-               /*
-                * Gather all of the CPU-specific caches into the
-                * global cache.
-                */
-               where = xc_broadcast(0, (xcfunc_t)pool_cache_xcall, pc, NULL);
-               xc_wait(where);
-       }
-#endif
+       /*
+        * Transfer the content of the local CPU's cache back into global
+        * cache. Note that this does not handle objects cached for other CPUs.
+        * A xcall(9) must be scheduled to take care of them.
+        */
+       pool_cache_xcall(pc);
+
+       /* Invalidate the global cache. */
        mutex_enter(&pc->pc_lock);
        full = pc->pc_fullgroups;
        empty = pc->pc_emptygroups;

