malloc(9) vs kmem(9) interfaces

To: tech-kern%NetBSD.org@localhost
Subject: malloc(9) vs kmem(9) interfaces
From: Taylor R Campbell <campbell+netbsd-tech-kern%mumble.net@localhost>
Date: Sat, 29 Oct 2022 14:42:27 +0000

Over the past 10-15 years, the NetBSD kernel has been slowly migrating
from the traditional BSD malloc(9) API to the Solaris-inspired kmem(9)
API.

The main differences between the interfaces are:

malloc(9)                       kmem(9)
---------                       -------
. attribution by malloc tags    . attribution only by size -- can try
  which match in alloc and free   to use return address but it's not
                                  matched between alloc and free
. size is stored, not passed    . size is stored only for diagnostics,
  to free                         must be passed to free
. allows zero-size allocs       . forbids zero-size allocs (even
                                  though they are allowed in Solaris)

I'm not too concerned about stored- vs passed-size, and zero-size
allocations don't seem like a big deal either way (although it strikes
me as silly to have adopted a Solaris API, except incompatibly, like
we did with condvar(9)).  But attribution is a different story.

The attribution by malloc tags used to make it clear which subsystem
was responsible for memory usage, which was helpful for chasing down
leaks.  With kmem(9), such attribution requires heuristic search based
on the size and return address.
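
To make the contrast concrete, here is a minimal userland sketch of the
two interface styles.  It is a model only, not the kernel code: the
names tagged_alloc, tagged_free, sized_alloc, sized_free, and the tags
TAG_FENCE and TAG_NETBUF are all hypothetical, and libc malloc stands
in for the real backing allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical tags, standing in for malloc(9) types. */
enum tag { TAG_FENCE, TAG_NETBUF, NTAGS };

/* Live bytes per tag: what tagged attribution makes directly visible. */
size_t tag_inuse[NTAGS];

/* Header stored in front of each tagged allocation to remember its size. */
union hdr {
	size_t size;
	max_align_t align;	/* keep the user pointer suitably aligned */
};

/* malloc(9)-style: tag passed to both alloc and free, size stored. */
void *
tagged_alloc(size_t size, enum tag t)
{
	union hdr *h = malloc(sizeof(*h) + size);

	if (h == NULL)
		return NULL;
	h->size = size;
	tag_inuse[t] += size;
	return h + 1;
}

void
tagged_free(void *p, enum tag t)
{
	union hdr *h;

	if (p == NULL)
		return;
	h = (union hdr *)p - 1;
	tag_inuse[t] -= h->size;
	free(h);
}

/* kmem(9)-style: no tag, and the caller passes the size back to free. */
size_t sized_inuse;	/* total live bytes, with no owner recorded */

void *
sized_alloc(size_t size)
{
	void *p;

	assert(size > 0);	/* models the ban on zero-size allocations */
	p = malloc(size);
	if (p != NULL)
		sized_inuse += size;
	return p;
}

void
sized_free(void *p, size_t size)
{
	sized_inuse -= size;
	free(p);
}
```

In this model, a fence buffer that is never freed stays visible as
nonzero tag_inuse[TAG_FENCE], whereas on the sized side the same leak
only shows up as anonymous bytes in sized_inuse.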

I added some dtrace probes recently to help monitor allocations by
requested size, but it still takes much more work to attribute leaks
to the code responsible for them.  It took a lot of effort, for instance,
to even recognize that major leaks from radeon and nouveau came from
fence allocations, because we had to:
1. start from which _sizes_ of allocations appeared to be leaking,
   then
2. monitor stack traces with dtrace to find where many of those
   allocations were happening, and then
3. guess which ones were _leaks_ because we can't match up the alloc
   and free except by size.

I tried to search for discussion about this but haven't found anything
substantive, just commit logs recording the transition happening.  So
I wonder:

- Was the rationale for migrating to kmem(9) written down or discussed
  publicly anywhere?

- What's the benefit of using kmem(9) over malloc(9)?

- Is it even worthwhile to complete this transition?

- What would the cost of restoring attribution be, other than the
  obvious O(ntag*nsizebuckets) memory cost to record it and the effort
  to annotate allocations?
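
For a rough sense of the O(ntag*nsizebuckets) cost, here is a userland
sketch of one counter per (tag, size bucket) pair.  Everything in it is
an assumption for illustration: the values of NTAGS, NSIZEBUCKETS, and
MINSHIFT, the strictly power-of-two bucketing, and the names
size_bucket, attr_alloc, attr_free, and attr_table_bytes.  None of it
is taken from the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NTAGS		256	/* assumed number of tags */
#define NSIZEBUCKETS	16	/* assumed number of size buckets */
#define MINSHIFT	4	/* assumed smallest bucket: 16 bytes */

/* One live-allocation counter per (tag, size bucket) pair. */
uint64_t attr_count[NTAGS][NSIZEBUCKETS];

/* Map a size to a power-of-two bucket; oversized requests share the last. */
unsigned
size_bucket(size_t size)
{
	unsigned b = 0;
	size_t cap = (size_t)1 << MINSHIFT;

	while (cap < size && b < NSIZEBUCKETS - 1) {
		cap <<= 1;
		b++;
	}
	return b;
}

/* Record an allocation of the given size against a tag. */
void
attr_alloc(unsigned tag, size_t size)
{
	assert(tag < NTAGS);
	attr_count[tag][size_bucket(size)]++;
}

/* Record the matching free; the caller passes the same tag and size. */
void
attr_free(unsigned tag, size_t size)
{
	assert(tag < NTAGS);
	attr_count[tag][size_bucket(size)]--;
}

/* Total memory taken by the counter table. */
size_t
attr_table_bytes(void)
{
	return sizeof(attr_count);
}
```

With these assumed numbers the table takes 256 * 16 * 8 = 32768 bytes.
A real implementation would likely want the counters per-CPU, which
multiplies that figure by the number of CPUs.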


Note: I'm not addressing the implementation here.  Right now they are
backed by the same array of pool caches indexed by mostly power-of-two
granularity sizes.  I'm only asking about the interface used across
the kernel and drivers.
