tech-kern archive


Fixing pool_cache_invalidate(9), final step



Dear all,

So here we are again: I would like to fix the pool_cache_invalidate(9)
logic once and for all. This has been on my TODO list for quite some
time, and I am tired of carrying a private, ugly patch of mine to work
around a shortcoming of the API (I would even consider it a bug, given
the documentation).

Summary:

- NetBSD 6: I would like to modify the pool_cache_invalidate(9) routine
so that it also invalidates the cache of curcpu(). This does not really
fix the call per se, but at least any caller is then free to broadcast
pool_cache_invalidate() via xcall(9) so that all per-CPU cached objects
are invalidated, which is not possible with the current -6
implementation. This caveat is documented; see
nb6-pcache-invalidate.diff, and the caller-side sketch right after this
summary.

-current: I would like to take a more radical step and perform the
xcall(9) directly from pool_cache_invalidate(9). Given that xcall(9)
broadcasts cannot be sent from interrupt or softint context (because of
the use of condvars), this change is likely to expose bugs in code paths
where invalidation is handled from interrupt context (which is a bad
thing IMHO). See current-pcache-invalidate.diff.
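
To illustrate the NetBSD 6 point above, here is a caller-side sketch of
that xcall(9) pattern. The frobcache_* names are made up for the
example and are not part of the patch:

#include <sys/pool.h>
#include <sys/xcall.h>

static void
frobcache_invalidate_xcall(void *arg1, void *arg2)
{
        pool_cache_t pc = arg1;

        /* With the -6 patch this also drains curcpu()'s cache. */
        pool_cache_invalidate(pc);
}

static void
frobcache_invalidate_all(pool_cache_t pc)
{
        uint64_t where;

        /* Run the invalidation on every CPU and wait for completion. */
        where = xc_broadcast(0, frobcache_invalidate_xcall, pc, NULL);
        xc_wait(where);
}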

I believe that para@'s big work on UVM/kmem removed most of the
invalidations that happened from interrupt/softint context (especially
for kva drains and pmap_growkernel), but there may be some remaining
elsewhere. Nothing obvious comes out of nxr.n.o or the custom
assertions I added to my kernels, so I suppose that bug reports and PRs
will be the main way of tracking them down (I will take a look and
patch them when they are trivial, but will definitely need a hand if
they affect tty/network code :) ).
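
For illustration, an assertion of this kind at the top of
pool_cache_invalidate() catches such callers (a sketch, not part of the
attached patches):

        /*
         * xc_broadcast(9) may sleep on a condvar; it must not be
         * reached from interrupt or softint context.
         */
        KASSERT(!cpu_intr_p());
        KASSERT(!cpu_softintr_p());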

I have been running multiple hosts with this patch for two months now,
but given the amount of memory they have I am not really hammering all
possible code paths. Still, no panic related to this change so far.

While we are at it, I would like to make the routine asynchronous for
-current, i.e. it broadcasts the invalidation but does not wait for all
CPUs to finish executing the xcall(9). The reason is that each CPU is
responsible for invalidating its own objects, and should not feed them
back to the global pool only for the invalidation caller to invalidate
them globally afterwards. That would be a waste of time, especially as
these objects can be freed safely on each CPU (ctor/dtor shall
implement their own internal locking, see pool_cache(9)).
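
To illustrate that last point, here is a hypothetical pool cache whose
ctor/dtor do their own locking, as pool_cache(9) requires; all names
are made up. This is what makes it safe for every CPU to destroy its
cached objects concurrently during the broadcast instead of funnelling
them through the global pool first:

#include <sys/mutex.h>
#include <sys/pool.h>
#include <sys/queue.h>

struct frob {
        LIST_ENTRY(frob) f_entry;
        int f_state;
};

static kmutex_t frob_lock;      /* mutex_init()'ed at attach time */
static LIST_HEAD(, frob) frob_list = LIST_HEAD_INITIALIZER(frob_list);

static int
frob_ctor(void *arg, void *obj, int flags)
{
        struct frob *f = obj;

        f->f_state = 0;
        mutex_enter(&frob_lock);        /* the ctor serializes itself... */
        LIST_INSERT_HEAD(&frob_list, f, f_entry);
        mutex_exit(&frob_lock);
        return 0;
}

static void
frob_dtor(void *arg, void *obj)
{
        struct frob *f = obj;

        mutex_enter(&frob_lock);        /* ...so the dtor is safe on any CPU */
        LIST_REMOVE(f, f_entry);
        mutex_exit(&frob_lock);
}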

For synchronous invalidation I will add pool_cache_invalidate_sync() in
case someone needs it (I do, for Xen), but I am not too fond of the
*_sync() name. Suggestions are gladly appreciated.
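
A rough sketch of the shape it could take, reusing the
pool_cache_invalidate_xcall() helper from the -current patch (the drain
of the global pc_*groups, identical to pool_cache_invalidate(), is left
out for brevity):

void
pool_cache_invalidate_sync(pool_cache_t pc)
{
        uint64_t where;

        if (ncpu < 2 || !mp_online) {
                /* Early boot: only the local CPU can hold cached objects. */
                pool_cache_invalidate_xcall(pc);
                return;
        }

        /* Broadcast the per-CPU invalidation and wait for every CPU. */
        where = xc_broadcast(0, (xcfunc_t)pool_cache_invalidate_xcall,
            pc, NULL);
        xc_wait(where);
}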

Comments/questions on the patches are welcome.

For those willing to read the initial (and a bit outdated) discussion,
see http://mail-index.netbsd.org/current-users/2011/09/24/msg017862.html

-- 
Jean-Yves Migeon
jym@NetBSD.org
Index: sys/kern/subr_pool.c
===================================================================
RCS file: /cvsroot/src/sys/kern/subr_pool.c,v
retrieving revision 1.195
diff -u -p -r1.195 subr_pool.c
--- sys/kern/subr_pool.c        5 May 2012 19:15:10 -0000       1.195
+++ sys/kern/subr_pool.c        29 May 2012 18:36:42 -0000
@@ -191,7 +191,8 @@ static bool pool_cache_get_slow(pool_cac
 static void    pool_cache_cpu_init1(struct cpu_info *, pool_cache_t);
 static void    pool_cache_invalidate_groups(pool_cache_t, pcg_t *);
 static void    pool_cache_invalidate_cpu(pool_cache_t, u_int);
-static void    pool_cache_xcall(pool_cache_t);
+static void    pool_cache_invalidate_xcall(pool_cache_t);
+static void    pool_cache_transfer_xcall(pool_cache_t);
 
 static int     pool_catchup(struct pool *);
 static void    pool_prime_page(struct pool *, void *,
@@ -1425,7 +1426,7 @@ pool_drain_start(struct pool **ppp, uint
        /* If there is a pool_cache, drain CPU level caches. */
        *ppp = pp;
        if (pp->pr_cache != NULL) {
-               *wp = xc_broadcast(0, (xcfunc_t)pool_cache_xcall,
+               *wp = xc_broadcast(0, (xcfunc_t)pool_cache_transfer_xcall,
                    pp->pr_cache, NULL);
        }
 }
@@ -2012,26 +2013,13 @@ void
 pool_cache_invalidate(pool_cache_t pc)
 {
        pcg_t *full, *empty, *part;
-#if 0
        uint64_t where;
 
-       if (ncpu < 2 || !mp_online) {
-               /*
-                * We might be called early enough in the boot process
-                * for the CPU data structures to not be fully initialized.
-                * In this case, simply gather the local CPU's cache now
-                * since it will be the only one running.
-                */
-               pool_cache_xcall(pc);
-       } else {
-               /*
-                * Gather all of the CPU-specific caches into the
-                * global cache.
-                */
-               where = xc_broadcast(0, (xcfunc_t)pool_cache_xcall, pc, NULL);
-               xc_wait(where);
-       }
-#endif
+       /*
+        * New objects will come from the global cache if the per-CPU cache
+        * is empty. Invalidate this cache first so it can start being filled
+        * earlier with renewed objects.
+        */
        mutex_enter(&pc->pc_lock);
        full = pc->pc_fullgroups;
        empty = pc->pc_emptygroups;
@@ -2047,6 +2035,25 @@ pool_cache_invalidate(pool_cache_t pc)
        pool_cache_invalidate_groups(pc, full);
        pool_cache_invalidate_groups(pc, empty);
        pool_cache_invalidate_groups(pc, part);
+
+       if (ncpu < 2 || !mp_online) {
+               /*
+                * We might be called early enough in the boot process
+                * for the CPU data structures to not be fully initialized.
+                * In this case, invalidate the local CPU's cache now
+                * since it will be the only one running.
+                */
+               pool_cache_invalidate_xcall(pc);
+       } else {
+               /*
+                * Signal all CPUs that they must invalidate their local
+                * cache for this pool.
+                */
+               where = xc_broadcast(0, (xcfunc_t)pool_cache_invalidate_xcall,
+                   pc, NULL);
+       }
+
+
 }
 
 /*
@@ -2415,13 +2422,31 @@ pool_cache_put_paddr(pool_cache_t pc, vo
 }
 
 /*
+ * pool_cache_invalidate_xcall:
+ *
+ *     Invalidate local CPU caches for pool pc.
+ *     Run within a CPU-bound cross-call thread.
+ */
+static void
+pool_cache_invalidate_xcall(pool_cache_t pc)
+{
+       int s;
+
+       /* block interrupts and disable preemption */
+       s = splvm();
+       pool_cache_invalidate_cpu(pc, curcpu()->ci_index);
+       splx(s);
+}
+
+
+/*
  * pool_cache_xcall:
  *
  *     Transfer objects from the per-CPU cache to the global cache.
  *     Run within a cross-call thread.
  */
 static void
-pool_cache_xcall(pool_cache_t pc)
+pool_cache_transfer_xcall(pool_cache_t pc)
 {
        pool_cache_cpu_t *cc;
        pcg_t *prev, *cur, **list;
Index: sys/kern/subr_pool.c
===================================================================
RCS file: /cvsroot/src/sys/kern/subr_pool.c,v
retrieving revision 1.194
diff -u -p -r1.194 subr_pool.c
--- sys/kern/subr_pool.c        4 Feb 2012 22:11:42 -0000       1.194
+++ sys/kern/subr_pool.c        29 May 2012 18:36:34 -0000
@@ -2244,27 +2244,14 @@ pool_cache_invalidate_groups(pool_cache_
 void
 pool_cache_invalidate(pool_cache_t pc)
 {
+       int s;
        pcg_t *full, *empty, *part;
-#if 0
-       uint64_t where;
 
-       if (ncpu < 2 || !mp_online) {
-               /*
-                * We might be called early enough in the boot process
-                * for the CPU data structures to not be fully initialized.
-                * In this case, simply gather the local CPU's cache now
-                * since it will be the only one running.
-                */
-               pool_cache_xcall(pc);
-       } else {
-               /*
-                * Gather all of the CPU-specific caches into the
-                * global cache.
-                */
-               where = xc_broadcast(0, (xcfunc_t)pool_cache_xcall, pc, NULL);
-               xc_wait(where);
-       }
-#endif
+       /*
+        * New objects will come from the global cache if the per-CPU cache
+        * is empty. Invalidate this cache first so it can start being filled
+        * earlier with renewed objects.
+        */
        mutex_enter(&pc->pc_lock);
        full = pc->pc_fullgroups;
        empty = pc->pc_emptygroups;
@@ -2280,6 +2267,16 @@ pool_cache_invalidate(pool_cache_t pc)
        pool_cache_invalidate_groups(pc, full);
        pool_cache_invalidate_groups(pc, empty);
        pool_cache_invalidate_groups(pc, part);
+
+       /* block interrupts and disable preemption */
+       s = splvm();
+       /*
+        * Invalidate objects in the curcpu() cache. Note that we are not
+        * invalidating objects cached by other CPUs, this requires a local
+        * call to pool_cache_invalidate() for each one of them.
+        */
+       pool_cache_invalidate_cpu(pc, curcpu()->ci_index);
+       splx(s);
 }
 
 /*
Index: share/man/man9/pool_cache.9
===================================================================
RCS file: /cvsroot/src/share/man/man9/pool_cache.9,v
retrieving revision 1.19
diff -u -p -r1.19 pool_cache.9
--- share/man/man9/pool_cache.9 15 Nov 2011 00:32:34 -0000      1.19
+++ share/man/man9/pool_cache.9 29 May 2012 18:36:38 -0000
@@ -360,3 +360,13 @@ subsystem is implemented within the file
 .Xr memoryallocators 9 ,
 .Xr percpu 9 ,
 .Xr pool 9
+.\" ------------------------------------------------------------
+.Sh CAVEATS
+.Fn pool_cache_invalidate
+only affects objects safely accessible by the local CPU.
+On multiprocessor systems this function should be called by each CPU to
+invalidate their local caches.
+See
+.Xr xcall 9
+for an interface to schedule the execution of arbitrary functions
+to any other CPU.

