tech-kern archive
Re: Is pool_cache_init limited to providing only one page worth of storage?
On Tue, Oct 22, 2024 at 11:18:49AM -0700, Brian Buhrow wrote:
> hello. I'm experimenting with modifying the if_xennet_xenbus driver to enlarge the pool.
> The current code allocates one page of space for the pool cache. When I try doubling that to
> 2 * PAGESIZE, I get the following panic:
>
> [ 57.0902893] panic: pr_phinpage_check: [xnfrx] item 0xffff829f53f5a000 not part of pool
>
> In looking through the sources, it looks like nothing allocates more than a page sized chunk of
> data when it calls pool_cache_init. Yet, I don't see anything in the man page that suggests
> the limit is 1 page of space. Is this a known limitation or is it a bug? If it's known, where
> is it documented?
it's a muddle.
the problem you're seeing is that the object that you're allocating from
the pool_cache (and its underlying pool) is larger than the underlying
pool's back-end allocator's "page" size.  if_xennet_xenbus passes NULL
for the allocator:
	if_xennetrxbuf_cache = pool_cache_init(PAGE_SIZE, 0, 0, 0,
	    "xnfrx", NULL, IPL_NET, NULL, NULL, NULL);
so a default allocator is chosen automatically by this code in
pool_cache_bootstrap():
	if (palloc == NULL && ipl == IPL_NONE) {
		if (size > PAGE_SIZE) {
			int bigidx = pool_bigidx(size);

			palloc = &pool_allocator_big[bigidx];
			flags |= PR_NOALIGN;
		} else
			palloc = &pool_allocator_nointr;
	}
however the ipl for this pool_cache is IPL_NET rather than IPL_NONE,
so pool_cache_bootstrap() leaves palloc NULL, and thus it defaults to
this later in pool_init():
	if (palloc == NULL)
		palloc = &pool_allocator_kmem;
this allocator is defined as:
struct pool_allocator pool_allocator_kmem = {
	.pa_alloc = pool_page_alloc,
	.pa_free = pool_page_free,
	.pa_pagesz = 0
};
that pa_pagesz of 0 is converted to PAGE_SIZE the first time that
the allocator is used in pool_init():
	if (palloc->pa_refcnt++ == 0) {
		if (palloc->pa_pagesz == 0)
			palloc->pa_pagesz = PAGE_SIZE;
I guess your kernel was built with DIAGNOSTIC disabled?  there's an assertion
in pool_init() that you would otherwise have hit, which would have made the
problem more obvious:
	KASSERTMSG((prsize <= palloc->pa_pagesz),
	    "%s: [%s] pool item size (%zu) larger than page size (%u)",
	    __func__, wchan, prsize, palloc->pa_pagesz);
as to whether this is intentional or a bug, the manpage for pool_cache_init says:
     palloc        Should typically be set to NULL, instructing
                   pool_cache_init() to select an appropriate back-end
                   allocator.  Alternate allocators can be used to partition
                   space from arbitrary sources.  Use of alternate allocators
                   is not documented here as it is not a stable, endorsed
                   part of the API.
currently you would need to provide a custom allocator to support a back-end
allocator "page" size larger than PAGE_SIZE for your non-IPL_NONE pool_cache,
so the documentation kind of says that you shouldn't do what you are trying to do.
the current rule that a non-IPL_NONE pool_cache (or any pool) will only support
object sizes up to PAGE_SIZE without a custom back-end allocator is not really
documented anywhere that I can find.
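for completeness, the custom-allocator route would look roughly like this
in the driver (an untested sketch: it assumes pool_page_alloc()/pool_page_free()
size their requests from pa_pagesz, as the pool_allocator_big[] entries do,
and the allocator variable name is made up):

```c
/* hypothetical back-end allocator with a two-page "page" size */
static struct pool_allocator xennet_rxbuf_allocator = {
	.pa_alloc = pool_page_alloc,
	.pa_free = pool_page_free,
	.pa_pagesz = 2 * PAGE_SIZE,
};

	if_xennetrxbuf_cache = pool_cache_init(2 * PAGE_SIZE, 0, 0, 0,
	    "xnfrx", &xennet_rxbuf_allocator, IPL_NET, NULL, NULL, NULL);
```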
your next question will be why can't the "big" default allocators be used
for pool_caches with ipl > IPL_NONE, and I see that I was actually the one
who introduced the big allocators, for the benefit of ZFS, which allocates
objects larger than PAGE_SIZE from pool_caches via wrappers which provide
the kmem_cache* API from solaris. but I don't recall any reason why
I made the pool_cache code use those allocators by default only for pools
with ipl == IPL_NONE. the attached patch may allow your change to
if_xennet_xenbus to work the way you were hoping, please give it a try.
-Chuck
Index: src/sys/kern/subr_pool.c
===================================================================
RCS file: /home/chs/netbsd/cvs/src/sys/kern/subr_pool.c,v
retrieving revision 1.290
diff -u -p -r1.290 subr_pool.c
--- src/sys/kern/subr_pool.c	9 Apr 2023 12:21:59 -0000	1.290
+++ src/sys/kern/subr_pool.c	27 Oct 2024 21:31:54 -0000
@@ -830,8 +830,18 @@ pool_init(struct pool *pp, size_t size,
 	mutex_exit(&pool_head_lock);
 #endif
 
-	if (palloc == NULL)
-		palloc = &pool_allocator_kmem;
+	if (palloc == NULL) {
+		if (size > PAGE_SIZE) {
+			int bigidx = pool_bigidx(size);
+
+			palloc = &pool_allocator_big[bigidx];
+			flags |= PR_NOALIGN;
+		} else if (ipl == IPL_NONE) {
+			palloc = &pool_allocator_nointr;
+		} else {
+			palloc = &pool_allocator_kmem;
+		}
+	}
 
 	if (!cold)
 		mutex_enter(&pool_allocator_lock);
@@ -2115,16 +2125,6 @@ pool_cache_bootstrap(pool_cache_t pc, si
 	unsigned int ppflags;
 
 	pp = &pc->pc_pool;
-	if (palloc == NULL && ipl == IPL_NONE) {
-		if (size > PAGE_SIZE) {
-			int bigidx = pool_bigidx(size);
-
-			palloc = &pool_allocator_big[bigidx];
-			flags |= PR_NOALIGN;
-		} else
-			palloc = &pool_allocator_nointr;
-	}
-
 	ppflags = flags;
 	if (ctor == NULL) {
 		ctor = NO_CTOR;