Subject: Re: mbuf external storage sharing
To: YAMAMOTO Takashi <yamt@mwd.biglobe.ne.jp>
From: Jonathan Stone <jonathan@dsg.stanford.edu>
List: tech-net
Date: 10/25/2004 11:03:44
In message <1098568486.639900.15402.nullmailer@yamt.dyndns.org>,
YAMAMOTO Takashi writes:

>hi,
>
>> > We are clearly talking at cross-purposes, but I still don't see where
>> > or why.  Perhaps you could explain once more what the intended purpose
>> > of your patch is?
>> 
>> my purposes are:
>> - being more mp-safe.
>> - preparing to implement lazy mapping of loaned pages, which can need to
>>   change a state (mapped/unmapped) of shared mbuf.
>
>have you been convinced?  or just too busy?

My sincere apologies for not responding: I was caught up with several
other things, and I'm just catching up with NetBSD mail.

As I've said: I am all for cleaner, better, safer, scalable mp-safe
code. I am not so concerned about lazy mapping of loaned pages; that's
not applicable to the traffic I currently care about.

But your `embedded header' cluster-mbuf idea struck me as a really bad idea:

1.  If the point is better memory efficiency, they are not a good
    approach.

2.  If the point is better MP-safeness and *scalability*, then they are
    neither necessary nor sufficient.

3.  Okay, allocating m_ext headers and buffer space as a
    virtually-contiguous buffer may save you one synchronization
    operation. But for a really scalable MP-safe solution, you want the
    common case to have a marginal cost of zero synchronization operations.

4.  Allocating an m_ext header and adjacent buffer space is going
    to do at least one of:
	a) badly fragment KVA space
	b) restrict the buffer space to a non-power-of-two size,
	   causing extra allocations (or just downright pain),
	   elsewhere;
	c) cause the data-buffers to cross page boundaries,
	   causing yet more grief and pain in the driver code that
	   has to set up DMA.
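
To put numbers on (b) and (c), here is a rough sketch of the arithmetic.
The page size, cluster size, and header size are illustrative assumptions
only, and the function names are made up for this example; none of them
come from any kernel header.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative values only; assumptions, not taken from any kernel source. */
#define PAGE_SZ     4096u   /* assumed page size */
#define CLUSTER_SZ  2048u   /* assumed power-of-two cluster size */

static int is_power_of_two(size_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Case (b): the slot stays CLUSTER_SZ bytes and the header is carved out
 * of it. Returns the payload that is left for packet data.
 */
static size_t payload_if_header_inside(size_t hdr)
{
    assert(hdr < CLUSTER_SZ);
    return CLUSTER_SZ - hdr;
}

/*
 * Case (c): the payload stays CLUSTER_SZ bytes, the header sits in front
 * of it, and slots are packed back to back from a page-aligned base.
 * Returns nonzero if the payload of slot i straddles a page boundary.
 */
static int payload_crosses_page(size_t i, size_t hdr)
{
    size_t stride = hdr + CLUSTER_SZ;
    size_t start = i * stride + hdr;        /* offset of first payload byte */
    size_t end = start + CLUSTER_SZ - 1;    /* offset of last payload byte */

    return (start / PAGE_SZ) != (end / PAGE_SZ);
}
```

With a hypothetical 32-byte header, case (b) leaves 2016 bytes of payload,
which is not a power of two, and in case (c) the payload of the second slot
already spans two pages. With no embedded header (hdr == 0) neither problem
arises for these sizes.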

Once we bite off designing a scalable MP-safe network buffer
allocation, the embedded m_ext is simply *not necessary* to achieve
either your goals or my goals. (See the FreeBSD-5 approach, either
source-code or the slightly different slant in Bosko Milekic's BSDCon
paper, for an existence proof.)
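
To illustrate what "zero synchronization operations in the common case"
can look like, here is a minimal single-threaded sketch of a per-CPU
buffer cache in front of a locked global pool. It is not FreeBSD's code
and not a proposed NetBSD interface: every name, size, and the lock
counter are hypothetical, and a real kernel would also have to keep the
caller pinned to its CPU while it touches the cache.

```c
#include <assert.h>
#include <stddef.h>

#define NCPU       4    /* assumed CPU count */
#define CACHE_MAX  8    /* assumed per-CPU cache capacity */
#define POOL_MAX   64   /* assumed total number of buffers */

struct buf { struct buf *next; };
struct percpu_cache { struct buf *head; unsigned count; };

static struct buf pool_storage[POOL_MAX];
static struct buf *global_head;          /* shared free list */
static unsigned long global_lock_ops;    /* counts simulated lock acquisitions */
static struct percpu_cache caches[NCPU];

/* Stand-ins for a real mutex; they only count acquisitions. */
static void global_lock(void)   { global_lock_ops++; }
static void global_unlock(void) { }

static void pool_init(void)
{
    size_t i;

    global_head = NULL;
    global_lock_ops = 0;
    for (i = 0; i < NCPU; i++) {
        caches[i].head = NULL;
        caches[i].count = 0;
    }
    for (i = 0; i < POOL_MAX; i++) {
        pool_storage[i].next = global_head;
        global_head = &pool_storage[i];
    }
}

static struct buf *buf_alloc(unsigned cpu)
{
    struct percpu_cache *c = &caches[cpu];
    struct buf *b;

    if (c->head == NULL) {
        /* Slow path: refill a batch from the global pool under the lock. */
        global_lock();
        while (c->count < CACHE_MAX / 2 && global_head != NULL) {
            b = global_head;
            global_head = b->next;
            b->next = c->head;
            c->head = b;
            c->count++;
        }
        global_unlock();
    }
    /* Fast path: pop from this CPU's private list, no lock taken. */
    b = c->head;
    if (b != NULL) {
        c->head = b->next;
        c->count--;
    }
    return b;
}

static void buf_free(unsigned cpu, struct buf *b)
{
    struct percpu_cache *c = &caches[cpu];

    if (c->count >= CACHE_MAX) {
        /* Slow path: drain half of the cache back to the global pool. */
        global_lock();
        while (c->count > CACHE_MAX / 2) {
            struct buf *d = c->head;
            c->head = d->next;
            c->count--;
            d->next = global_head;
            global_head = d;
        }
        global_unlock();
    }
    /* Fast path: push onto this CPU's private list, no lock taken. */
    b->next = c->head;
    c->head = b;
    c->count++;
}
```

In this sketch the first allocation on a CPU pays for one lock acquisition
to refill the cache; a steady alloc/free pattern on the same CPU after
that never touches the lock again, and no header needs to be embedded in
the buffer for that to work.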

Thus, the embedded m_ext idea is *NOT* orthogonal to a well-designed,
scalable mbuf-allocation design.