Subject: Re: m_get(), MGET(), and MGETHDR() with M_WAIT
To: Paul Kranenburg <pk@cs.few.eur.nl>
From: Chris G Demetriou <Chris_G_Demetriou@ux2.sp.cs.cmu.edu>
List: tech-kern
Date: 06/06/1996 19:12:01
[ There is an answer at the end of all of this. ]

> But in the `m_get/MGET' case, there _is_ a spec on record. Unless someone
> digs up a conflicting one, I'd say we stick with that.

If you're speaking of TCP/IP Illustrated, Volume 2, then I would say
that spec is incorrect.  It says, and I quote (p. 42):

	Even though the caller specifies M_WAIT, the return value must
	still be checked, since, as we'll see in Figure 2.13, waiting
	for an mbuf does not guarantee that one will be available.

However, the implementation of MGET shown there begins with

	MALLOC(...);
	if (m) {
		...
	} else
		(m) = m_retry(...);

Figure 2.13 describes m_retry.

In other words, it says that MGET can fail because m_retry can fail,
but for the M_WAIT case that isn't true, because MALLOC is guaranteed
never to return NULL if allowed to wait.  In summary, the book says
"you need to check because of (a condition that can never happen),"
because it seems to miss a vital part of the definition of the kernel
memory allocator.
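
To make that argument concrete, here is a small user-space model of
the control flow.  It is only a sketch: every name in it (model_malloc,
model_m_retry, model_mget, the MODEL_* constants) is made up for
illustration, and "sleeping" is faked by simply making a buffer
appear.  The assumption it encodes is the one above: an allocator
that is allowed to wait never hands back NULL.

```c
#include <assert.h>
#include <stddef.h>

/* All names below are hypothetical; this is a toy model, not kernel code. */
enum { MODEL_M_DONTWAIT, MODEL_M_WAIT };
#define MODEL_NBUFS 8

struct model_mbuf { int m_type; };

static struct model_mbuf model_store[MODEL_NBUFS]; /* toy backing store */
static int model_used;     /* buffers handed out so far */
static int model_avail;    /* buffers the allocator may hand out right now */
static int model_retries;  /* number of times the retry path ran */

/* Stand-in for MALLOC: may fail only when it is not allowed to wait. */
static struct model_mbuf *
model_malloc(int how)
{
	if (model_avail == 0) {
		if (how == MODEL_M_DONTWAIT)
			return NULL;
		/* Assumed behavior: sleep until memory is freed. */
		model_avail = 1;
	}
	/* Hopeless exhaustion is a panic here, not a NULL return. */
	assert(model_used < MODEL_NBUFS);
	model_avail--;
	return &model_store[model_used++];
}

/* Stand-in for m_retry: count the attempt, then allocate again. */
static struct model_mbuf *
model_m_retry(int how, int type)
{
	struct model_mbuf *m;

	model_retries++;
	m = model_malloc(how);
	if (m != NULL)
		m->m_type = type;
	return m;
}

/* Same shape as the MGET excerpt: allocate, else fall back to retry. */
static struct model_mbuf *
model_mget(int how, int type)
{
	struct model_mbuf *m = model_malloc(how);

	if (m != NULL)
		m->m_type = type;
	else
		m = model_m_retry(how, type);
	return m;
}
```

In this model, a MODEL_M_WAIT request never reaches model_m_retry at
all, so whether the retry path can fail does not matter for it; only
MODEL_M_DONTWAIT callers can see NULL.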


That having been said, I dug into the 4.3 and 4.4 daemon books (I
bought the latter yesterday... does anybody else think that its cover
is one of the ugliest they've ever seen? 8-), and there is some text,
but it's not specific.

Quoting the 4.3 daemon book (page 291):

	[ ... ]  Mbuf-allocation requests indicate that they must be
	fulfilled immediately or that they can wait for available
	resources.  If a request is marked as ``can wait'' and
	the requested resources are unavailable, the process is
	put to sleep to await available resources.  [ ... ]

	An mbuf-allocation request is made through a call to m_get(),
	m_getclr(), or one of the equivalent macros used for
	efficiency purposes.  [ ... ]

It doesn't explicitly say that m_get() will never return NULL, but
could be read to mean that.

The 4.4 daemon book (page 373) says explicitly that non-blocking
requests will fail if there are no resources, but doesn't say what
happens to blocking requests.  Again, this could be read to mean that
MGET(), regardless of implementation, should never return NULL.
However, the surrounding text really does seem like a "this is the
implementation, not a definition" passage, so I'm not particularly
satisfied with it.


So, off I went, looking for mbuf references.


I found the 4.2BSD Networking Implementation Notes (Revised July,
1983), by Leffler, Joy, and Fabry, in the UCB CS tech report archive
(number csd-83-146).  It defines the mbuf structures and basic utility
routines (e.g. m_copy(), m_pullup()), but doesn't actually define
MGET or m_get.

Actually, that omission annoyed me...  what's the point of telling
what you can do with them if you don't bother saying how you can get
them?


Anyway, at that point, I started digging through the various sources
I have access to (Mach, BSD) to find the answer.


What I found:

(1) the M_WAIT/M_DONTWAIT distinction was added to the mbuf code
    by Mike Karels in 1985.  Even if MGET/m_get had been
    documented in the tech report, the documentation wouldn't have
    been useful.  8-) 

(2) In the initial implementation, MGET/m_get with M_WAIT would never,
    ever return NULL.  It would sleep for as long as necessary, and
    in the case of hopeless resource exhaustion (the equivalent of the
    cause of the current code's "mb_map full" panic), panic.

(3) As far as I can tell, subsequent implementations (even e.g.
    in Mach) kept that property: MGET/m_get with M_WAIT never,
    ever returns NULL.


Given all of that, unless somebody has a strong counterexample that
says MGET/m_get with M_WAIT can return NULL in a BSD kernel, I'm
willing to say:
	(1) TCP/IP Illustrated, Volume 2 is incorrect to say that
	    it can, and
	(2) checks for NULL returns from MGET/m_get/m_gethdr
	    invocations with M_WAIT are wasteful and should be
	    removed.
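
For the sake of discussion, here is roughly what (2) would mean for
calling code.  This is a user-space sketch with invented names
(sketch_m_get, sketch_send_wait, sketch_send_nowait, SKETCH_ENOBUFS),
not a patch against any real file, and it assumes the M_WAIT
guarantee described above.  The assert in the blocking caller is only
there to state that assumption in the sketch; it is not proposed
kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; a sketch, not real kernel interfaces. */
enum { SKETCH_M_DONTWAIT, SKETCH_M_WAIT };
#define SKETCH_ENOBUFS 55	/* illustrative error value */

struct sketch_mbuf { int m_len; };

static struct sketch_mbuf sketch_buf;	/* the only buffer in this toy */
static int sketch_exhausted;		/* nonzero: pretend memory is short */

/* Assumed contract: NULL is possible only for non-blocking requests. */
static struct sketch_mbuf *
sketch_m_get(int how)
{
	if (sketch_exhausted && how == SKETCH_M_DONTWAIT)
		return NULL;
	return &sketch_buf;
}

/* Non-blocking caller: the NULL check is required and stays. */
static int
sketch_send_nowait(int len)
{
	struct sketch_mbuf *m = sketch_m_get(SKETCH_M_DONTWAIT);

	if (m == NULL)
		return SKETCH_ENOBUFS;
	m->m_len = len;
	return 0;
}

/* Blocking caller: no error path, because none can be taken. */
static int
sketch_send_wait(int len)
{
	struct sketch_mbuf *m = sketch_m_get(SKETCH_M_WAIT);

	assert(m != NULL);	/* states the assumed guarantee */
	m->m_len = len;
	return 0;
}
```

The point is just that the blocking caller loses a branch and an
error return that, under the stated guarantee, could never be
exercised.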


takers?  8-)



chris