tech-kern archive


Re: pool_cache_invalidate(9) (wrong?) semantic -- enable xcall invalidation

On 25.09.2011 01:50, Jean-Yves Migeon wrote:
I would like to turn back on the xcall(9) block found in
pool_cache_invalidate(), now that rmind@ has implemented high priority
cross calls [1].

FWIW, I am still struggling with this, without really having an idea on how to fix this once and for all.

I only investigated two solutions, and each one has its own share of problems. Please let me know if you see other possibilities:

1 # add an "invalidate" flag to the pool cache, one per CPU. Requesting an invalidation sets the flag for every CPU; each time a CPU allocates from the pool_cache(9), it checks that flag first and drains its local cache if necessary.
        + fairly easy to implement, no interlocking needed (via atomic_ops)
        - does not allow synchronous invalidation, which is the point of pool_cache_invalidate(): sleeping ("offline") CPUs will not necessarily deplete their own pool caches.

2 # schedule a softint/xcall (depending on priority) to make each CPU push its cached items back to the main pool, then deplete the main pool.
        + synchronous
        + ensures that all CPUs have their local pools depleted
        - not interrupt-safe: the current xcall(9) implementation requires cv_wait() to wait for all cross calls to complete. This is problematic when invalidation is called from softint context, which happens in the network stack and in the x86 pmap_growkernel code. Fixing all call sites seems like... a non-trivial task.

I am not sure whether "busy waiting" for softints in xc_wait(9) would be appropriate for high-priority cross calls; e.g., instead of cv_wait()ing for the high-priority broadcast to finish, we could have something like:

        /* Slow path: block until awoken. */
        while (xc->xc_donep < where) {
                if (xc == &xc_low_pri) {
                        /* low priority xcall(9) */
                        cv_wait(&xc->xc_busy, &xc->xc_lock);
                } else {
                        /* Spin until all softint handlers complete. */
                        /* XXX what about the softint of curcpu()? */
                }
        }
Note that this _will_ have to be fixed eventually: the current semantics of pool_cache_invalidate() are bogus, as it only depletes the global cache, not the per-CPU ones.

This forces all consumers of the API (those with their own constructor) to check the validity of each item obtained from the pool cache, as it may come either from a per-CPU pool (and thus carry now-invalid state) or from the global pool (and thus be freshly constructed just after the invalidation).

FYI, the initial mail I sent about a month ago:
A while back I had to find a way to invalidate all allocations done in a
pool_cache(9). Unfortunately, the pool_cache_invalidate(9) function is
misleading: it only invalidates and reclaims objects cached in the
"global" lists, not those cached in the per-CPU ones.

This has unexpected consequences: all consumers of the pool_cache(9) API
have to ensure that the objects they get out of it are still valid (see
the pmap allocation in x86 pmap_create() as an example).

In my case (PAE support for Xen save/restore), page directories may be
cached in a per-cpu pool; not invalidating them upon suspend will lead
to wrong type pinning and completely break restore.

I had to introduce a pool_cache_invalidate_local(9) function back then to
permit each CPU to deplete its own pool and destroy all cached objects.
This made the API abstraction leaky, and it was quickly backed out [2].

Unfortunately the cross-call block had to be commented out later on, as
it only allowed pools with low IPL to be managed [3]. Now that this is
fixed, I'd like to enable it again. XC_HIGHPRI makes cross calls (for
x86) via IPIs combined with a softint that runs on the targeted CPUs. So
I would expect pool_cache_invalidate() to be safe for pools used in
interrupt context too.





Jean-Yves Migeon
