Subject: CMSG_* problems
To: None <tech-kern@netbsd.org>
From: der Mouse <mouse@Rodents.Montreal.QC.CA>
List: tech-kern
Date: 02/11/2007 01:00:55
I started looking at some stuff relating to passing access rights
through AF_LOCAL sockets.  This brought me up against the CMSG_* mess.

I've come to the conclusion that this API is rather broken - the most
charitable I can be towards it is "not very well thought out".  I'm
writing here both to check my work - to ask whether there's something
lurking somewhere I missed that means the flaws I see aren't really
flaws at all - and to have a bash at coming up with improvements.

Specifically, it seems to me that the only ways to use the API without
making assumptions not promised by C involve requiring that the
msg_control buffer be suitably aligned for a struct cmsghdr, which
basically means that it must be malloc()ed, and malloc()ed specifically
for the purpose (not a non-initial part of a larger malloc()ed buffer).

This is because CMSG_DATA is the only provided way to find out where
the data for a control message lives, but CMSG_DATA works only if you
give it a struct cmsghdr, and then it works only assuming the data
follows it in memory the way it does in the control message.  If you
don't assume the control buffer is aligned, and you copy a struct
cmsghdr's worth of bytes into a struct cmsghdr (which is the right way
to deal with possible misalignment), then using CMSG_DATA on that may
generate a pointer past the end of the object, something not, strictly,
permitted in C.  (If the machine's choice of alignment requirements for
the CMSG_* interface, and struct padding conventions, collaborate
appropriately, the pointer may be only *just* past the end of the
object and thus legal, but this is not promised.)

Alternatively, in what appears to be the way RFC 2292 and the CMSG_*
macros expect, you can cast your buffer pointer to a struct cmsghdr
pointer - at which point you must make sure it's aligned.
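
A minimal sketch of that cast-based style, assuming the control buffer
comes straight from the malloc() family so that alignment is not in
question.  The helper name ctl_alloc_int and its one-int payload are
hypothetical, chosen only for illustration:

```c
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: allocate a control buffer holding one message
 * whose payload is a single int, and attach it to mh.  The caller must
 * free(mh->msg_control).  Returns NULL if allocation fails.
 */
static struct cmsghdr *
ctl_alloc_int(struct msghdr *mh, int level, int type, int value)
{
	size_t space = CMSG_SPACE(sizeof(int));
	void *buf = calloc(1, space);	/* malloc-family memory is aligned */
	struct cmsghdr *cm;

	if (buf == NULL)
		return NULL;
	mh->msg_control = buf;
	mh->msg_controllen = space;
	cm = CMSG_FIRSTHDR(mh);		/* in effect, the cast */
	cm->cmsg_len = CMSG_LEN(sizeof(int));
	cm->cmsg_level = level;
	cm->cmsg_type = type;
	memcpy(CMSG_DATA(cm), &value, sizeof(value));
	return cm;
}
```

This works only because the buffer was allocated for the purpose; point
msg_control into the middle of some other buffer and the cast is no
longer safe.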

When I've had to write to the CMSG_* interface, I usually end up
assuming I can use pointers past the end of an object, locating the
data with something like

	bp + ((char *)CMSG_DATA(&cmh) - (char *)&cmh)

(where cmh is the struct cmsghdr I've copied the header into).
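
Spelled out as a function, that workaround might look like the sketch
below.  The name cmsg_data_at and the avail/datalen parameters are
hypothetical; bp points at the start of a control message inside a
buffer that need not be aligned:

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: given bp, the start of a control message with
 * avail bytes remaining in the buffer, return a pointer to its data
 * and store the data length in *datalen.  Returns NULL if the header
 * does not fit or its length field is implausible.
 */
static const unsigned char *
cmsg_data_at(const unsigned char *bp, size_t avail, size_t *datalen)
{
	struct cmsghdr cmh;
	size_t skip;

	if (avail < sizeof(cmh))
		return NULL;
	memcpy(&cmh, bp, sizeof(cmh));	/* copy: bp may be misaligned */
	/* This may form a pointer past the end of cmh, as noted above. */
	skip = (size_t)((char *)CMSG_DATA(&cmh) - (char *)&cmh);
	if ((size_t)cmh.cmsg_len < skip || (size_t)cmh.cmsg_len > avail)
		return NULL;
	*datalen = (size_t)cmh.cmsg_len - skip;
	return bp + skip;
}
```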

CMSG_FIRSTHDR and CMSG_NXTHDR have similar problems, because they too
assume they are being used on structs cmsghdr embedded in a control
message buffer.  Fortunately, there is CMSG_SPACE, which allows you to
walk the buffer yourself.
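
A rough sketch of such a walk, assuming that CMSG_LEN(0) is the
distance from the start of a header to its data (true of the RFC 2292
sample definitions, but an assumption all the same).  The function name
cmsg_count is hypothetical; it merely counts the messages in a buffer
that need not be aligned:

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: count the control messages in buf[0..buflen). */
static size_t
cmsg_count(const unsigned char *buf, size_t buflen)
{
	struct cmsghdr cmh;
	size_t hdrlen = CMSG_LEN(0);
	size_t off = 0, n = 0, step;

	while (buflen - off >= sizeof(cmh)) {
		memcpy(&cmh, buf + off, sizeof(cmh));
		if ((size_t)cmh.cmsg_len < hdrlen ||
		    (size_t)cmh.cmsg_len > buflen - off)
			break;		/* malformed or truncated */
		n++;
		/* Header plus data plus padding, without using pointers. */
		step = CMSG_SPACE((size_t)cmh.cmsg_len - hdrlen);
		if (step >= buflen - off)
			break;		/* that was the last message */
		off += step;
	}
	return n;
}
```

Because only offsets are computed, no pointer is ever formed outside
the buffer, and the last message is accepted whether or not it carries
trailing padding.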

Thus, my first question: is the above analysis missing anything?

If not, my proposal: the creation of macros akin to CMSG_DATA and
CMSG_NXTHDR which don't return pointers, but instead, take a struct
cmsghdr and return the distance from its beginning in the buffer to the
beginning of its data (CMSG_DATA-alike) or next cmsghdr
(CMSG_NXTHDR-alike).  For the sake of concreteness, I suggest
CMSG_DATASKIP and CMSG_NXTSKIP as their names, though I'm by no means
wedded to those and would cheerfully entertain alternatives.
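
To make the proposal concrete, here is one possible way the two macros
could be written in terms of CMSG_LEN and CMSG_SPACE.  These
definitions are only a sketch: taking a pointer to the struct cmsghdr
is an assumption, as is the premise that CMSG_LEN(0) equals the
header-to-data distance on the system at hand:

```c
#include <stddef.h>
#include <sys/socket.h>

/*
 * Distance from the start of a control message to the start of its
 * data.  The argument is unused here; it is kept so both macros are
 * called the same way.
 */
#define CMSG_DATASKIP(cmh)	((size_t)CMSG_LEN(0))

/*
 * Distance from the start of a control message to the start of the
 * next header.  No check is made against the end of any buffer.
 */
#define CMSG_NXTSKIP(cmh) \
	((size_t)CMSG_SPACE((size_t)(cmh)->cmsg_len - CMSG_LEN(0)))
```

With these, code that has copied a header out of the buffer at bp into
cmh would find the data at bp + CMSG_DATASKIP(&cmh) and the next header
at bp + CMSG_NXTSKIP(&cmh), after checking the latter against the
buffer length itself.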

The only complication I see is the case where CMSG_NXTHDR would return
a null pointer.  Since my proposed amount-to-skip macro cannot know
from just the cmsghdr where the input cmsghdr falls in the buffer (and
indeed it may not be in a buffer yet, as when constructing messages), I
propose it take only the cmsghdr, not the msghdr, and not do any checks
for running past the end of the buffer.  (In passing, I think our
current implementation - and the sample implementation given in the RFC
- is buggy, in that it will fail to return a null pointer if the last
control message ends exactly at the end of the control message buffer,
without padding.  The > really needs to be >=.  Our implementation also
does not handle a null pointer second argument as specified in RFC 2292
section 4.3.2.)

/~\ The ASCII				der Mouse
\ / Ribbon Campaign
 X  Against HTML	       mouse@rodents.montreal.qc.ca
/ \ Email!	     7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B