Subject: Re: mbuf external storage sharing
To: YAMAMOTO Takashi <yamt@mwd.biglobe.ne.jp>
From: Jonathan Stone <jonathan@dsg.stanford.edu>
List: tech-net
Date: 10/08/2004 08:42:15
In message <1097202573.157728.1552.nullmailer@yamt.dyndns.org>,
YAMAMOTO Takashi writes:
>(assuming that you mean m_getcl,)
Oops. I thought I was using the same API name as Bosko Milekic's
BSDcon paper on FreeBSD-5?
>as i wrote in the previous mail,
>the optimization, which might be good idea, does not contradict but
>is independent to what my patch does.
Sorry, but I sitll think that's incorrect. The optimizations I
envisage, if done well, eliminates any need for, or benefit from,
the majority of your patch.
>i don't understand what you mean, sorry.
>it seems that you're confused between different topics.
>
>there're (at least) two topics you're mixing up:
>
>- how to share/unshare mbuf external storage
> it's what my patch attmpts to change.
>
> both of my patch and freebsd5 allocate a small header which
> contains a reference count. no much difference here, afaik.
>
> to manipulate the reference count, freebsd5 uses atomic_ operations
> and my patch uses spinlock. i used spinlock just because we don't
> have MI-exposed atomic_ operations.
But that's not really what I'm getting at. Sure, to add a reference
to an existing external cluster, yes, you need to atomically increment
the refcount, and we can discuss the merits of spinlocks protecting an
increment, vs. explict atomic-increment :-/.
My point, however, is that when I look at the actual uses of your
fused ``atomically get an tiny-mbuf and external mbuf'', those uses
were in NIC drivers. Specifically, for acquiring new DMA buffers for
receive-side DMA queues.
In *that* case, if we allocate from per-CPU pools in the common case,
you don't need spinlocks or atomic increments *at all*, for the
initial allocation. And there are good, and very well-known reasons,
why having per-CPU pools is a Good Thing.
> (it might be good to have MI-exposed atomic_ operations like
> freebsd and linux. but it's a separate topic.)
>
>- how to allocate/free mbufs and clusters
> freebsd5 has some optimizations, including m_getcl.
> although they might be good and i have no objections against them,
> they're orthogonal to my patch.
You keep saying the issues are orthogonal. To me, it seems quite
plain that they are not orthogonal. If we do per-CPU pools, then we
can do efficient SMP-safe allocation *without* fusing a tiny-mbuf
header into a smaller variant of a cluster mbuf. (If your goal is
space efficiency, there is prior art for better approaches.)
We are clearly talking at cross-purposes, but I still don't see where
or why. Perhaps you could explain once more what the intended purpose
of your patch is?