On Fri, 9 Oct 2009 00:28:56 -0700, Jason Thorpe <thorpej%shagadelic.org@localhost> wrote:
On Oct 8, 2009, at 11:52 PM, Jean-Yves Migeon wrote:
Consider the case of an mbuf. It is allocated from a per-CPU pcg on CPU-A,
and subsequently bounces around the networking stack, through socket
buffers, etc., and is finally freed. But the processor that frees it may
be CPU-B (for any number of legitimate reasons), thus it goes into
CPU-B's per-CPU pcgs.
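The situation above can be sketched in userspace C (the names pcg_put,
pcg_get and the sizes are hypothetical, mine only, not the real
pool_cache(9) internals): each CPU only ever looks at its own pcg, so an
object freed on CPU-B becomes invisible to CPU-A's cache.

```c
/* Minimal userspace sketch, NOT the real pool_cache(9) implementation:
 * each "CPU" owns a small stack of cached object pointers, standing in
 * for its per-CPU pcgs. */
#include <stddef.h>

#define NCPU    2
#define PCG_SZ  8

struct pcg {
	void	*objs[PCG_SZ];	/* cached constructed objects */
	int	 avail;		/* number of cached objects */
};

static struct pcg cpu_pcg[NCPU];

/* Free path: the object lands in the cache of the CPU that frees it. */
static int
pcg_put(int cpu, void *obj)
{
	struct pcg *pcg = &cpu_pcg[cpu];

	if (pcg->avail == PCG_SZ)
		return 0;	/* full: would fall back to the global cache */
	pcg->objs[pcg->avail++] = obj;
	return 1;
}

/* Allocation path: a CPU only ever consults its own cache. */
static void *
pcg_get(int cpu)
{
	struct pcg *pcg = &cpu_pcg[cpu];

	if (pcg->avail == 0)
		return NULL;	/* empty: would fall back to the global cache */
	return pcg->objs[--pcg->avail];
}
```

With this model, an mbuf-like object put back on CPU 1 can never be
returned by a pcg_get() on CPU 0, which is exactly the ownership effect
being discussed.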
In this case, the mbuf is now cached by CPU-B. You will not find it inside
CPU-A's cache. pool_cache_invalidate_local() invoked on CPU-A will not touch
it, as CPU-A's pool_cache does not hold any reference to it.
The same situation will happen if you pool_cache_destroy() the pool cache
from CPU-B. Except that, in this case, CPU-B _will_ destroy objects cached
by CPU-A's pool_cache, while pool_cache_invalidate_local() won't. It is the
caller's responsibility to ensure that such a situation never occurs.
I need a way to invalidate all pool caches, even those that are
CPU-bound. pool_cache_invalidate() only does that for the global cache,
as it cannot invalidate CPU caches for CPUs other than its own.
CPU-bound is the wrong term here. Nothing is "bound" to the CPU except
a set of pcgs that cache constructed objects. The objects are merely kept
in CPU-local caches in order to provide a lockless allocation path in the
common case. The objects themselves are not bound to any one CPU.
Depends on the way you see it. When an object is put back in a CPU-local
cache, it is bound to this CPU. No other CPU could allocate this object
(pool_cache will not allow CPU-A to allocate objects cached in CPU-B's pcgs).
I am not interested in the owner of objects currently in use, but in
those that are currently released but not "freed" (those cached in the
pcgs).
Before invalidate_local(), the only way would be to destroy the
pool_cache entirely, just to release the objects found in pc_cpus. That
would cripple the entire VM system, as it cannot work without the
shadow page pool.
What do you mean "destroy the pool_cache entirely"?
Invoke pool_cache_destroy() on it.
So you're just working
around a bug in pool_cache_invalidate()? If pool_cache_invalidate() is not
also zapping the per-CPU pcgs for that pool_cache, then that is a bug.
It cannot do so safely. pool_cache_invalidate() can be called whenever you
want, as the global cache is protected by a mutex.
Per-CPU caches are not, due to lockless allocation. If CPU-A invalidates
the per-CPU caches of CPU-B while CPU-B is concurrently running a
pool_cache_get(), this could result in unspecified behavior.
The main intent of pool_cache_invalidate() is to nuke any cached
copies of an object if the constructed form were to change for some reason.
If per-CPU cached copies are not included in that, then a subsequent
allocation could return an incorrectly-constructed object, leading to
subtle bugs.
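To illustrate that failure mode, here is a minimal userspace sketch (the
names and the "gen" stamp are hypothetical, not the real implementation):
an invalidation that drains only the global layer lets the per-CPU layer
hand back an object constructed under the old form.

```c
/* Hypothetical sketch: objects are cached in constructed form, stamped
 * with the "generation" of the constructor that built them.  Invalidating
 * only the global cache leaves stale copies in the per-CPU layer. */
#include <stddef.h>

#define PCG_SZ 8

struct obj {
	int gen;		/* constructor generation at build time */
};

static int ctor_gen = 1;	/* bumped when the constructed form changes */

static struct obj *global_cache[PCG_SZ];
static int global_avail;

static struct obj *cpu_cache[PCG_SZ];	/* one CPU's pcg, for brevity */
static int cpu_avail;

/* Free path: the object lands in the per-CPU layer. */
static void
cache_put(struct obj *o)
{
	cpu_cache[cpu_avail++] = o;
}

/* Invalidate as if only the global cache were drained. */
static void
invalidate_global_only(void)
{
	global_avail = 0;	/* stale per-CPU copies survive */
}

/* Allocation prefers the per-CPU layer, like the lockless fast path. */
static struct obj *
cache_get(void)
{
	if (cpu_avail > 0)
		return cpu_cache[--cpu_avail];
	if (global_avail > 0)
		return global_cache[--global_avail];
	return NULL;
}

/* Nonzero if the object matches the current constructed form. */
static int
obj_is_fresh(const struct obj *o)
{
	return o->gen == ctor_gen;
}
```

After the constructed form changes (ctor_gen is bumped) and a
global-only invalidation, cache_get() still returns the old-generation
object: exactly the "incorrectly-constructed object" above.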
- pool_cache(9) does not state that:
      Destruct and release all objects in the global cache. Per-CPU caches
      will not be invalidated by this call, meaning that it is still possible
      to allocate "stale" items from the cache. If relevant, the user must
      check for this condition when allocating items.
There are subsystems in the kernel that depend on the
pool_cache_invalidate() semantics I describe; see
arch/alpha/alpha/pmap.c:pmap_growkernel() for an example. The L1 PTPs are
cached in constructed form (with the L1 PTEs for the kernel portion of the
address space already initialized). If PTPs are added to the kernel pmap
in such a way as to require an additional L1 PTE to link them up, the
already-constructed-but-free L1 PTPs need to be discarded, since they would
be missing part of the kernel's address space.
In the current form, that is correct. pool_cache_invalidate() will not
release _all_ objects, only globally cached ones. Stale objects in per-CPU
pcgs will not be freed, nor updated. In Alpha's pmap_growkernel() case,
you end up desyncing your PTPs.
This is wrong, and needs to be fixed, by performing the invalidation
for each CPU (or an equivalent way to do it; I am open to suggestions).
Sigh, really the per-CPU pcgs should not be "lockless", but rather should
be mutex-protected but "never contended" ... just in case you DO need to
manipulate them from another CPU (this is how Solaris's kmem_cache works, or
at least it did at one point).
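A minimal userspace sketch of that approach, assuming POSIX mutexes stand
in for kernel ones (all names are hypothetical, loosely inspired by the
Solaris kmem_cache idea above): the owning CPU takes its own,
almost-always-free lock on every get/put, and a remote CPU may take the
same lock to drain the cache safely.

```c
/* Sketch of "mutex-protected but never contended" per-CPU caches.
 * Each CPU's cache has its own lock: the owner takes it uncontended on
 * the fast path, and any CPU may take it to invalidate remotely. */
#include <pthread.h>
#include <stddef.h>

#define NCPU   2
#define PCG_SZ 8

struct cpu_cache {
	pthread_mutex_t	 lock;	/* rarely contended: usually only the owner */
	void		*objs[PCG_SZ];
	int		 avail;
};

static struct cpu_cache caches[NCPU] = {
	{ PTHREAD_MUTEX_INITIALIZER, { 0 }, 0 },
	{ PTHREAD_MUTEX_INITIALIZER, { 0 }, 0 },
};

/* Fast path on the owning CPU: the lock is almost always free. */
static int
cache_put(int cpu, void *obj)
{
	struct cpu_cache *cc = &caches[cpu];
	int ok = 0;

	pthread_mutex_lock(&cc->lock);
	if (cc->avail < PCG_SZ) {
		cc->objs[cc->avail++] = obj;
		ok = 1;
	}
	pthread_mutex_unlock(&cc->lock);
	return ok;
}

static void *
cache_get(int cpu)
{
	struct cpu_cache *cc = &caches[cpu];
	void *obj = NULL;

	pthread_mutex_lock(&cc->lock);
	if (cc->avail > 0)
		obj = cc->objs[--cc->avail];
	pthread_mutex_unlock(&cc->lock);
	return obj;
}

/* Any CPU can now drain every per-CPU cache, no cross-call needed. */
static int
cache_invalidate_all(void)
{
	int cpu, dropped = 0;

	for (cpu = 0; cpu < NCPU; cpu++) {
		struct cpu_cache *cc = &caches[cpu];

		pthread_mutex_lock(&cc->lock);
		dropped += cc->avail;	/* a real pool would destruct these */
		cc->avail = 0;
		pthread_mutex_unlock(&cc->lock);
	}
	return dropped;
}
```

The trade-off discussed below is exactly this: the fast path now pays for
a lock/unlock on every get/put, even though the lock is essentially never
contended.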
This is more intrusive in pool_cache(9). IMHO, protecting the per-CPU pcgs
with mutexes that will very rarely have contention (the system will
spend a lot more time allocating from these pcgs than asking for their
invalidation from another CPU...) is overkill.
If there is a lockless (or "contention-free mutex") way to do it, fine.
But this will alter a performance-critical code path, and should be thought
out very carefully.