Subject: SoC ideas - mbuf API
To: None <tech-net@netbsd.org>
From: Pavel Cahyna <pavel@netbsd.org>
List: tech-net
Date: 06/21/2006 09:47:49
I would like to present some proposals about planned modifications to the
mbuf API for my SoC project.

In short, I think that the current mbuf code makes too difficult to write
correct code, and too easy to make mistakes, as noted in this post:

http://permalink.gmane.org/gmane.os.netbsd.devel.network/7299

My proposal precises my reply in that thread:

http://permalink.gmane.org/gmane.os.netbsd.devel.network/7302

I see basically two problems with the current mbuf API:

The 1st problem: if the code does not use m_pullup/pulldown when it should, it
works in most cases, but when it encounters a packet fragmented in a right
way, it malfunctions.

Example: kern/29014.

The 2nd problem: mbufs can be read-only, and there are cases when they are
overwritten without checking for writability. See the above-referenced
thread for an example, or PR kern/33162.

Moreover, m_pullup always copies from the mbuf cluster, even if the data
are contiguous, for historical reasons, so the current code avoids calling
it if not necessary making it even more complicated. The result looks like:

	if (m->m_len < sizeof(struct ether_header)) {
		m = m_pullup(m, sizeof(struct ether_header));
		if (m == NULL)
			return NULL;
	}
	eh = mtod(m, struct ether_header *);

Such code is present in about hundred of places in the kernel.

Solution: simplify this common idiom. Let's make a mptr(m, type) macro which
would be used like:

eh = mptr_pullup(m, struct ether_header, 0)
if (eh == NULL)
	recovery from error;


where the mptr_pullup(struct mbuf *m, type, off)

and it would return the pointer to data in m starting at off, cast to
const type * (in tis case to const struct ether_header *), and make the
mbuf contiguous for sizeof (type) bytes, like m_pulldown does.

Then deprecate mtod from all the code which does not have to deal with mbuf
internals.

For the code which "knows" that the packet is already contiguous it would
be enough to have a mptr() which would not do m_pulldown but check
and panic if DIAGNOSTIC. (The goal is to pull-up early and make this the
common case.)

For most of the uses the version returning const should be enough. Code
which needs a writable pointer to the mbuf could use a mptr_rw macro,
which would do a m_makewritable on the returned region. Or such code
should be converted to m_copyback, then mptr_rw wouldn't be needed. There
shouldn't be many places where writing to mbufs is necessary.

Comments? What do people think about the naming of the proposed functions?

Pavel Cahyna