I believe I'm seeing much the same problem, on a stock NetBSD 6.1
amd64 Xen DOMU. It usually happens during /etc/daily, but sadly not
reliably, nor when I run /etc/daily in a tight loop. It does seem to
recur every few days, and I can't drop into ddb when it happens.
pool_cache_put_slow() has to allocate some administrative storage, via
vmem_alloc() and in turn uvm_km_kmem_alloc(). uvm_km_kmem_alloc() first
calls vmem_alloc() and then uvm_pagealloc(). If uvm_pagealloc() fails,
it calls vmem_free() on the allocation it got from vmem_alloc() and
returns ENOMEM.
The problem is that vmem_free() attempts to re-pool the allocation (if
QCACHE is defined), which re-enters pool_cache_put() and starts the
whole process again.
void
vmem_free(vmem_t *vm, vmem_addr_t addr, vmem_size_t size)
{

	KASSERT(size > 0);

#if defined(QCACHE)
	if (size <= vm->vm_qcache_max) {
		int qidx = (size + vm->vm_quantum_mask) >>
		    vm->vm_quantum_shift;
		qcache_t *qc = vm->vm_qcache[qidx - 1];

		pool_cache_put(qc->qc_cache, (void *)addr);
		return;
	}
#endif /* defined(QCACHE) */
	vmem_xfree(vm, addr, size);
}
I'm going to try the below, which has the effect of never attempting to
re-pool the freed allocation in the ENOMEM case.
Technically vmem_alloc() and vmem_xfree() should not be mixed, but in
this case I see no functional problem with it. It's just awkward that
the documentation and a sense of aesthetics tell us not to :)
--- sys/uvm/uvm_km.c.orig	2013-12-03 16:33:14.000000000 +1300
+++ sys/uvm/uvm_km.c	2013-12-03 16:34:16.000000000 +1300
@@ -787,7 +787,7 @@
 		} else {
 			uvm_km_pgremove_intrsafe(kernel_map, va,
 			    va + size);
-			vmem_free(kmem_va_arena, va, size);
+			vmem_xfree(kmem_va_arena, va, size);
 			return ENOMEM;
 		}
 	}