Re: Fixing pool_cache_invalidate(9), final step
On Wed, 30 May 2012 22:43:14 +0200, Lars Heidieker wrote:
Well, if I read the patch right, the transfer xcall will be issued in
pool_drain_start and waited for in pool_drain_end; therefore the other
xcall in pool_cache_invalidate, which releases per-CPU cached objects
directly into the pool, will most likely have work to do.
One question puzzles me right now: isn't it more efficient to do the
cache group swapping than to release the CPU-locally cached objects
directly to the pool?
IMHO the bottlenecks are the ctor/dtor, not the pool_cache objects
themselves. Feeding all objects back to the pool for invalidation frees
ncpu-1 CPUs from doing object destruction, at the cost of the calling
CPU having to do all the heavy work.
This depends on how ctor/dtor locking is done. "Distributing" the same
work (object destruction) to different CPUs while using mutexes in the
ctor/dtor will basically serialize the operation; true, this is
Having the CPU-cached objects pushed to the pool_cache global cache
and then draining it on the calling CPU seems quite efficient to me.
Just an idea (motivated by the fact that the split of pool_drain is
ugly IMHO, as it exposes an internal state of the draining, the xcall
arg): merge pool_drain_start and pool_drain_end, remove the xcall
from it, do the xcall from pool_cache_invalidate instead, and probably
give it an argument saying whether to wait, i.e. do it synchronously.
I will propose a patch later today including the drain_start/_end merge
with the transfer xcall.
As all CPUs might have caches to release, not much time will be wasted,
and they will contend on the pool_cache global cache, which should be
quicker than contending on the underlying pool (I haven't
There's also contention for the destruction of objects (not necessarily
heavy for a few, but there may be thousands of them cached). I would
expect this part to take longer than the contention on the underlying
pool.
The "normal" drain call from the pagedaemon might be done async, so we
don't wait for a CPU busy with interrupts in that case, which might
happen with low-priority xcalls and which is one of your concerns, if I
got it right.
Except for the async idea, I had this running with high-priority xcalls.
For the pool_cache destruction case we have to make sure that any
pool_cache_invalidate, with its pool_cache_invalidate_groups calls and
possibly the xcall that transfers the CPU-local caches to the pool_cache
global cache, has finished before pool_cache_invalidate_cpu is called.
If pool_cache_xcall is called from pool_drain, then we could rejoin
pool_drain_start and pool_drain_end...
The synchronous pool_cache_invalidate_<whatever> would fit such a
scheme; this should be covered by giving pool_cache_invalidate an arg.
If my grepping is right, pool_reclaim/pool_cache_reclaim are only used
in subr_pool.c (for _HARDKERNEL), so draining should indeed only happen
from the pageout daemon and not from interrupt context at all; so we
should (if I'm not mistaken) be able to simplify by integrating
pool_drain_start/end and calling the pool_cache_xcall from there.
Currently, with the split pool_drain_start/end, the time the xcall takes
might on SMP be used to drain the buffer pool; that is the only
difference that comes to my mind quickly.
I came to the same conclusion while looking at the drain_start/_end
functions, but they were outside the scope of my patch (I expected
opposition, so kept it as small as possible).
I think this comes as a package ;-)
Just some thoughts,
Will do, however I cannot make such an API change for -6, even if it
does not appear in the documentation (I can still ask releng@ and core@,
but I highly doubt that this is worth the hassle). I can do this for
I will update both patches accordingly, but the drain start/end split
will likely remain for -6.